Performance penalty from inheritance

Posted by Thomas Mercer-Hursh on 06-Jul-2010 14:04

In a comment on OE Hive on this node http://www.oehive.org/node/1793 Guillaume Morin mentioned the performance penalty from inheritance.  This surprised me, so I asked for test code, which he provided.

In his test he has two class hierarchies. One has five private data members and five methods in a single class, with no inheritance or interfaces. The other puts one of the members in the leaf class, one in its parent, and three in the grandparent; the five methods are all in the leaf class, and the classes reference interfaces. The constructor for the leaf takes five arguments, passes four of them up via SUPER, and assigns one to its own member. The parent passes three up via its SUPER and assigns one, and the grandparent assigns the remaining three values.
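(To make the shape concrete, here is a minimal sketch of the inherited variant. The names are invented and the interfaces are omitted; in ABL each class would live in its own .cls file.)

CLASS GrandParent:
    DEFINE PRIVATE VARIABLE m1 AS INTEGER NO-UNDO.
    DEFINE PRIVATE VARIABLE m2 AS INTEGER NO-UNDO.
    DEFINE PRIVATE VARIABLE m3 AS INTEGER NO-UNDO.

    CONSTRUCTOR PUBLIC GrandParent (p1 AS INTEGER, p2 AS INTEGER, p3 AS INTEGER):
        ASSIGN m1 = p1
               m2 = p2
               m3 = p3.
    END CONSTRUCTOR.
END CLASS.

CLASS Parent INHERITS GrandParent:
    DEFINE PRIVATE VARIABLE m4 AS INTEGER NO-UNDO.

    CONSTRUCTOR PUBLIC Parent (p1 AS INTEGER, p2 AS INTEGER, p3 AS INTEGER, p4 AS INTEGER):
        SUPER (p1, p2, p3).   /* chains up to GrandParent */
        m4 = p4.
    END CONSTRUCTOR.
END CLASS.

CLASS Leaf INHERITS Parent:
    DEFINE PRIVATE VARIABLE m5 AS INTEGER NO-UNDO.

    CONSTRUCTOR PUBLIC Leaf (p1 AS INTEGER, p2 AS INTEGER, p3 AS INTEGER, p4 AS INTEGER, p5 AS INTEGER):
        SUPER (p1, p2, p3, p4).   /* chains up to Parent */
        m5 = p5.
    END CONSTRUCTOR.
END CLASS.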

Running a loop that creates 10,000 of the first type and then 10,000 of the leaf class of the second type on my machine (with no tuning, etc.) gives 30.081s for the version without inheritance and 66.324s for the version with inheritance.
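(The driver is essentially a timed loop of NEWs. A minimal sketch, assuming each instance is deleted as it goes; whether the original tests did that is a guess on my part:)

DEFINE VARIABLE i       AS INTEGER NO-UNDO.
DEFINE VARIABLE elapsed AS INTEGER NO-UNDO.
DEFINE VARIABLE o       AS Leaf    NO-UNDO.

elapsed = ETIME(TRUE).   /* reset the millisecond timer */
DO i = 1 TO 10000:
    o = NEW Leaf(1, 2, 3, 4, 5).
    DELETE OBJECT o.
END.
elapsed = ETIME.
MESSAGE "elapsed ms:" elapsed VIEW-AS ALERT-BOX.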

I thought that the chained SUPERs might be the issue, so I created a second version. In mine, I eliminated the methods and the interfaces, preferring to test those separately. I changed all the private data members to public properties and eliminated all of the logic in the constructors. The values are assigned to the properties in a separate ASSIGN statement following the NEW.
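(Again a sketch with invented names: in both variants the members become public properties, the constructors are emptied, and the values are set by one ASSIGN after the NEW.)

CLASS FlatLeaf:
    DEFINE PUBLIC PROPERTY p1 AS INTEGER NO-UNDO GET. SET.
    DEFINE PUBLIC PROPERTY p2 AS INTEGER NO-UNDO GET. SET.
    DEFINE PUBLIC PROPERTY p3 AS INTEGER NO-UNDO GET. SET.
    DEFINE PUBLIC PROPERTY p4 AS INTEGER NO-UNDO GET. SET.
    DEFINE PUBLIC PROPERTY p5 AS INTEGER NO-UNDO GET. SET.
END CLASS.

DEFINE VARIABLE o AS FlatLeaf NO-UNDO.
o = NEW FlatLeaf().
ASSIGN o:p1 = 1
       o:p2 = 2
       o:p3 = 3
       o:p4 = 4
       o:p5 = 5.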

That gave me 14.789s for the version with no inheritance and 47.619s for the version with inheritance.

What's up??? The two leaf classes are functionally equivalent, and the same logic is used internally and externally with both. The difference in definition should be resolved at compile time. But here we are seeing a 3X performance penalty!!!!

All Replies

Posted by Thomas Mercer-Hursh on 06-Jul-2010 14:29

Another data point .... I just reran the test, but eliminated all the actual assignments. The numbers were essentially the same. So the big performance penalty is coming in the NEW, not in the access. I suppose this is less disturbing since it only happens once per object, but this seems like an extreme variation.

Posted by Peter Judge on 06-Jul-2010 15:12

tamhas wrote:

Another data point .... I just reran the test, but eliminated all the actual assignments. The numbers were essentially the same. So the big performance penalty is coming in the NEW, not in the access. I suppose this is less disturbing since it only happens once per object, but this seems like an extreme variation.

At the risk of asking the obvious, what version are you running? And can you attach your test code?

-- peter

Posted by Thomas Mercer-Hursh on 06-Jul-2010 15:39

10.2B

Here is Guillaume's original

Posted by Thomas Mercer-Hursh on 06-Jul-2010 15:41

And here is mine

Posted by gus on 07-Jul-2010 13:15

In one case, 10,000 constructors are executed and in the other, 30,000 constructors are executed.

Posted by Thomas Mercer-Hursh on 07-Jul-2010 13:20

Are you saying that executing an empty constructor is the issue?  And that one actually has 30,000 objects instead of 10,000?

Posted by gus on 07-Jul-2010 15:29

No, there are not 30,000 objects; there are 10,000.

I understood your inheritance hierarchy to have 3 levels: leaf, parent, and grandparent. So there are 30,000 constructors, 3 per object.

When you create a leaf object, its constructor is run. The leaf class constructor runs the parent class constructor first, then the stuff you put in it. The parent constructor runs the grandparent constructor, etc., all the way up to the top of the hierarchy.

Same with destructors.

--
regards,
gus bjorklund, progress software

If we wish to count lines of code, we should not regard them as lines produced but as lines spent. (Edsger Dijkstra)

Posted by Thomas Mercer-Hursh on 07-Jul-2010 15:45

And you are going to run all these constructors even though they are all empty?  And running the empty constructors is going to triple the time it takes to create the object?

Posted by gus on 07-Jul-2010 16:09

Perhaps I misunderstood. I thought you had constructors. I did not examine your code nor run it; I only looked at the description of what was being done. I will take a look.

You didn't think inheritance was free, did you?

Even if you have not written explicit constructors, there is still stuff that has to be done at the point where your constructor code would be executed. Data members and properties have to be allocated and initialised. Inherited constructors have to be run. I don't know exactly what the r-code has in it but will find out.

--
regards,
gus bjorklund, progress software

If we wish to count lines of code, we should not regard them as lines produced but as lines spent. (Edsger Dijkstra)

Posted by Thomas Mercer-Hursh on 07-Jul-2010 16:25

Each has a constructor, but the constructors are all empty. They are all just:

constructor ClassA ( ):
end constructor.

Yes, the public properties in those superclasses obviously have to get initialized, but there are exactly the same number of total properties in the two cases, i.e., exactly the same amount of work needs to be done. And yet, the inheritance version is taking much longer to do that work.

I wouldn't have minded a small difference, but 3X seems like something is wrong. Why would this not be resolved at compile time? Why would the leaf class in the inheritance hierarchy not be optimized into R-code identical to the case with no inheritance? At runtime, there is no significance to a property coming from a parent.

Posted by Matt Baker on 07-Jul-2010 18:00

For giggles I ran this on my home PC (the code from TMH's zip file), with -l 50000 -mmax 65534 and everything run from .r code. In each pair below, the first number is the version without inheritance and the second the version with inheritance:

With -q, empty constructors:       635   1539
With -q, no constructors:          566   1505
Without -q, empty constructors:   1181   2892
Without -q, no constructors:      1102   2871

There is a bit of overhead from just having the empty constructors there, but it doesn't really add much. -q has the greatest effect.

Posted by guilmori on 08-Jul-2010 08:40

What is the difference? This still shows the same ratio between with and without inheritance.

Posted by Matt Baker on 08-Jul-2010 08:49

You are right.  My point was to make it clear that the empty constructors don't really make a difference.

Posted by Shelley Chase on 08-Jul-2010 09:34

Hi Thomas,

We of course are always looking for ways to improve performance, and class instantiation is one of the areas currently being investigated. That being said, it was a design decision with OO to keep each class as a separate r-code file. The benefits are numerous, mostly around being able to change a class in a hierarchy without having to recompile. This is some of the beauty of the ABL, where you can put a .p earlier on your PROPATH to change behavior; to some extent you can do that in OO ABL. Java follows the same model, whereas .NET tightly couples the hierarchy.

So at compile time we look at the hierarchy and set up our dispatch table as appropriate, and we keep a "digest" value for each r-code used. If at runtime the digests match, we can quickly use the dispatch table (this is what happens with -q). If not, we need to redo the dispatch table at runtime, which is something to avoid if possible.

As far as runtime class instantiation goes, each class is run, the setup block is run (like block 0 for procedures), and then the constructor is run, which must immediately instantiate the super class, and the same happens up the hierarchy. This is necessary with strong typing. As I said, we are looking at ways to improve this, like maybe keeping pools of object instances for reuse, etc.

Let me know any suggestions/comments you might have.

-Shelley

Posted by gus on 08-Jul-2010 10:39

Your measurements don't include running the destructors (I know you don't

have any, but you could have done) which is another bit of additional

overhead.

When you inherit, you can have 30,000 destructors. Without inheritance, you

have 10,000 destructors.

There are a variety of optimisations (oo related and otherwise) that could

be implemented and we are working on some of them.

--

regards,

gus bjorklund, progress software

If we wish to count lines of code, we should not regard them as lines

produced but as lines spent. (Edsger Dijkstra)

Posted by Thomas Mercer-Hursh on 08-Jul-2010 12:32

So, yes, appropriate parameters speed up the whole process, but the contrast between the inherited version and the non-inherited version is still there and nearly as substantial.

My point is that there is only one object being instantiated in each case ... or, there should be only one. The inheritance should be resolved at compile time into a single resulting object. In this case, the instantiated objects should be identical. So, why is one taking 2-3 times longer?

Posted by Thomas Mercer-Hursh on 08-Jul-2010 12:45

So, we know we can't blame it on empty constructors ... except that logically there is a constructor whether it is empty or not. But having this have a performance impact only makes sense if the AVM is constructing 30,000 objects instead of 10,000.

Posted by Thomas Mercer-Hursh on 08-Jul-2010 12:50

Yipe.  So you really are instantiating 30,000 objects in the second case.  Ouch.

Posted by gus on 08-Jul-2010 13:14

I told you --- after the object itself is created, the constructors are run

for each class in the hierarchy.

While the code for the constructors could, in theory, be elided into a

single constructor at your leaf level, in fact they are not.

Furthermore, if you do not write a constructor, a default constructor which

calls the superclass constructor is generated in the r-code. We are going to

eliminate that.

--

regards,

gus bjorklund, progress software

If we wish to count lines of code, we should not regard them as lines

produced but as lines spent. (Edsger Dijkstra)
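(To illustrate gus's point: leaving the constructor out of, say, the Leaf class sketched earlier behaves as if the compiler had generated the following; this is an illustration of the described behavior, not of the actual r-code.)

CONSTRUCTOR PUBLIC Leaf ( ):
    SUPER ( ).   /* runs Parent's constructor, which in turn runs GrandParent's */
END CONSTRUCTOR.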

Posted by gus on 08-Jul-2010 13:14

tamhas wrote:

So, we know we can't blame it on empty constructors ...

how do you know that?

tamhas wrote:

... except that logically there is a constructor whether it is empty or not. But having this have a performance impact only makes sense if the AVM is constructing 30,000 objects instead of 10,000.

No, there aren't 30,000 objects.

--
regards,
gus bjorklund, progress software

If we wish to count lines of code, we should not regard them as lines produced but as lines spent. (Edsger Dijkstra)

Posted by Thomas Mercer-Hursh on 08-Jul-2010 13:54

No, there aren't 30,000 objects.

Doesn't Shelley's explanation contradict this?

Posted by Tim Kuehn on 08-Jul-2010 14:23

schase wrote:

As far as runtime class instantiation goes, each class is run, the setup block is run (like block 0 for procedures), and then the constructor is run, which must immediately instantiate the super class, and the same happens up the hierarchy. This is necessary with strong typing. As I said, we are looking at ways to improve this, like maybe keeping pools of object instances for reuse, etc.

Let me know any suggestions/comments you might have.

Is the AVM engaging in all the overhead that goes with a normal RUN statement when it's running a constructor?

Posted by Evan Bleicher on 08-Jul-2010 15:06

Yes, the Language team is investigating changes to improve performance of class instantiation. This is part of an ongoing effort. For example, as Gus noted in this thread, we are looking at making improvements to class instantiation when the application does not provide a default constructor.

Posted by gus on 08-Jul-2010 15:44

No, there aren't 30,000 objects.

Doesn't Shelley's explanation contradict this?

I see nothing in Shelley's explanation that contradicts this, nor anything that supports it.

Either way, there are 30,000 constructors to be executed.

--
regards,
gus bjorklund, progress software

If we wish to count lines of code, we should not regard them as lines produced but as lines spent. (Edsger Dijkstra)

Posted by Tim Kuehn on 08-Jul-2010 15:58

gus wrote:

Either way, there are 30,000 constructors to be executed.

And imagine what would happen if 30% of those class instances had a TT definition. 

Posted by Tim Kuehn on 08-Jul-2010 16:00

bleicher wrote:

Yes, the Language team is investigating changes to improve performance of class instantiation. This is part of an ongoing effort. For example, as Gus noted in this thread, we are looking at making improvements to class instantiation when the application does not provide a default constructor.

Doesn't the AVM know that the target's an object instance?

Posted by Admin on 08-Jul-2010 16:04

And imagine what would happen if 30% of those class instances had a TT definition. 

With or without REFERENCE-ONLY?
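(For context, a class-level temp-table is created for each instance unless it is defined REFERENCE-ONLY, in which case nothing is allocated at NEW and a table must be bound in before use. A minimal sketch with invented names:)

CLASS OrderCache:
    /* every NEW OrderCache() creates its own instance of this table */
    DEFINE PRIVATE TEMP-TABLE ttOrder NO-UNDO
        FIELD OrderNum AS INTEGER
        INDEX idxOrder IS PRIMARY UNIQUE OrderNum.
END CLASS.

CLASS OrderViewer:
    /* REFERENCE-ONLY: no table is allocated at NEW; a caller binds one in,
       e.g. via a TABLE ... BIND parameter */
    DEFINE PRIVATE TEMP-TABLE ttOrder NO-UNDO REFERENCE-ONLY
        FIELD OrderNum AS INTEGER.
END CLASS.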

Posted by Thomas Mercer-Hursh on 08-Jul-2010 16:10

schase wrote:

That being said, it was a design decision with OO to keep each class as a separate r-code file.

I take this to mean that if there is a class hierarchy in which the superclasses are compilable, i.e., not abstract, there would be a separate r-code file for each. I don't have a problem with that, even though good OO suggests that one should never be instantiating one of those superclasses on its own.

schase wrote:

So at compile time we look at the hierarchy and set up our dispatch table as appropriate, and we keep a "digest" value for each r-code used.

Three classes used, so three digest values. No?

schase wrote:

As far as runtime class instantiation goes, each class is run, the setup block is run (like block 0 for procedures), and then the constructor is run, which must immediately instantiate the super class, and the same happens up the hierarchy.

So, this seems to indicate that all classes in the hierarchy are going to be instantiated. That seems to point to 30,000 class instantiations to me.

This certainly would be a strong disincentive for those people who like to stick superclasses over the top of everything "just in case".

Posted by Admin on 08-Jul-2010 16:20

I don't have a problem with that, even though good OO suggests that one should never be instantiating one of those superclasses on its own.

Thomas, for the sake of understanding what's going on in the runtime, wouldn't it be wise to set lectures about good OO design aside for now? Whether the super class is abstract or not is probably irrelevant right now, because an abstract base class can also have constructors.

Posted by Thomas Mercer-Hursh on 08-Jul-2010 16:36

Point being that it is a very strong OO design principle that one only ever instantiates the leaf classes in a hierarchy. If that is done, then one doesn't even need R-code for the superclasses, unless the R-code is used in the compilation of the leaf class. While I appreciate the flexibility implied by the current approach, it seems inherently wrong to instantiate the superclasses. It is really a very different thing to stick an alternate .p into a PROPATH than to switch the classes in an inheritance hierarchy; that seems like something one would actively not allow. Even if one thought there was a good reason to instantiate a superclass in some circumstance, I would instantiate it there only, not everywhere the subclass was referenced.

Posted by Admin on 08-Jul-2010 16:54

tamhas wrote:

Point being that it is a very strong OO design principle that one only ever instantiates the leaf classes in a hierarchy. If that is done, then one doesn't even need R-code for the superclasses, unless the R-code is used in the compilation of the leaf class.

So you'd rather compile the R-code of every base class into every leaf class? Sounds a lot like compiling include files into one R-code; it reminds me terribly of how large R-code got with ADM1 applications.

I'd rather get back to looking into compiling directly into .PL files, providing stricter control over the class hierarchy there and the compiled bits here, rather than duplicating the R-code in every leaf class.

Posted by Thomas Mercer-Hursh on 08-Jul-2010 17:19

So you'd rather compile the R-Code of every base class into every leaf  class?

Well, yes. It would mean a somewhat larger disk footprint, but that is hardly an issue. If all it did was glom together the R-code of the hierarchy, it would mean exactly the same memory footprint we have now, exactly the same volume of code to load as now, and only one object to instantiate. And one would hope that a tiny bit of optimization would occur such that, for example, three empty constructors would not result in three nested calls.

I don't see how compiling into PL files is going to change things. It seems to me that the overhead here is doing the initialization setup on three objects when only one of those is going to be directly executed.

Posted by Admin on 09-Jul-2010 00:55

tamhas wrote:

So you'd rather compile the R-Code of every base class into every leaf  class?

Well, yes. It would mean a somewhat larger disk footprint, but that is hardly an issue. If all it did was glom together the R-code of the hierarchy, it would mean exactly the same memory footprint we have now, exactly the same volume of code to load as now, and only one object to instantiate. And one would hope that a tiny bit of optimization would occur such that, for example, three empty constructors would not result in three nested calls.

I don't see how compiling into PL files is going to change things. It seems to me that the overhead here is doing the initialization setup on three objects when only one of those is going to be directly executed.

Well, the last time I saw an ADM1 application with huge include files compiled into every .r file, it was 700 MB of R-code. A similarly sized ADM2 application would probably be around 100 MB of R-code. That IS significant to me. I know that modern architectures look different, but some of our base classes have R-code sizes of 300 MB. You want to tell me there's a benefit in compiling 300 MB into 1000 leaf classes (it's a UI class, and inheriting from it 1000 times is more than realistic)?

It's not just about execution of code (be it constructors or methods). Performance is also about loading R-code. You can certainly cache a lot, but wasting precious memory on duplicated R-code blocks is still a waste. And so there would be a huge performance penalty for people who are actually using many different leaf classes of the same base class. Only when the whole R-code is loaded can Progress start optimizing the number of constructor blocks executed. But for loading, it's a huge difference whether I'm loading two leaf classes of 20 MB each (300 + 20 + 20) or two leaf classes each containing the R-code of the base (320 + 320 MB). And so I believe the current architecture is designed to perform well. What's missing is a bit more control.

In a previous thread we had a discussion about different strategies for .PL files (mainly for library-level protection). But a .PL that acts more like a .NET assembly, rather than just a deployment vehicle, could certainly optimize things by ensuring that the R-code of the root class and the leaf class are in sync, and whenever the root class changes, the whole assembly requires a rebuild.

Message edited by Mike Fechner

Posted by jmls on 09-Jul-2010 01:19

Wow. I have to be doing something wrong. My biggest class .r is 187K.

300MB? Are you sure?

Julian

Posted by Admin on 09-Jul-2010 01:24

300MB ? Are you sure ?

Close to that, yes. Rounded so it's simpler to calculate.

Posted by jmls on 09-Jul-2010 01:34

wow.

Julian

Posted by Thomas Mercer-Hursh on 09-Jul-2010 11:43

I'm with Julian here.  I could imagine 300KB, but not 300MB.

But, whatever the size is, let's consider a couple of things.

First, if there is a class hierarchy of A->B->C, whatever size each of those might be, using the current approach all three are going to be loaded into memory anyway. That can't possibly be any smaller than a combined R-code created for A that includes all three, and the combined version is likely to be smaller if any optimization is done.

Second, while one hopes that there will only be one copy of the R-code either way, the data portion of all three can't possibly be smaller than the combined data portion, and it involves three redirects to find all of it instead of just one.

Third, are you really going to define a C class inherited by 1000 subclasses with 300MB of code in it? When people do things like define a Business Entity superclass inherited by all Business Entity classes, that superclass only contains a small handful of common members ... nothing else would make sense.

Agreed that if you have 1000 business entity classes all inheriting from some BE superclass and you load all 1000 of them into the same session at the same time, then you will have 1000 copies of that common code instead of one ... but how realistic is that? I would think it unlikely that any one session would be handling more than a handful of BE types simultaneously. I just don't see much likelihood of huge amounts of duplicated code.

So, for a small incremental memory use for that common code, one drastically reduces the total number of objects and the total redirections, and gains the potential of optimizing out empty supers and the like.

I just don't see how a PL really impacts this. While I was a big fan of sticking a variant .p in front of the standard .p on a PROPATH, the idea of sticking in an alternate superclass that is not the one against which the leaf class was compiled just seems wrong to me.

Posted by Admin on 09-Jul-2010 13:19

tamhas wrote:

I'm with Julian here.  I could imagine 300KB, but not 300MB.

Got me :-)

First, if there is a class hierarchy of A->B->C, whatever size each of those might be, using the current approach all three are going to be loaded into memory anyway. That can't possibly be any smaller than a combined R-code created for A that includes all three, and the combined version is likely to be smaller if any optimization is done.

That's true if there is only one leaf class loaded. But when an instance of a second leaf class of the same base class is loaded, there is a huge benefit, as the runtime already has features to optimize loading the same R-code (of the base class) - think of the effect of -q. So in my case there are 300 kB less loaded from disk and required in memory, while in your case 2 * 300 kB would have to get into and remain in memory.

Multiply that by any number you think is realistic; your approach can't win here. And useless execution of constructors could be avoided (by the AVM) in both situations, or not.

Second, while one hopes that there will only be one copy of the R-code either way, the data portion of all three can't possibly be smaller than the combined data portion, and it involves three redirects to find all of it instead of just one.

I'm very sceptical about the first part of that statement. I doubt that the AVM would be able to identify "parts" of different R-codes as identical and keep only one version of that part in memory. But I'm with you on the data portion. Still, with base classes of 300 kB, the data portion of those classes is less important than the code.

Third, are you really going to define a C class inherited by 1000 subclasses with 300MB of code in it? When people do things like define a Business Entity superclass inherited by all Business Entity classes, that superclass only contains a small handful of common members ... nothing else would make sense.

My example is from the UI. And when you create classes that need to play nice in the .NET UI and offer a great design-time experience in the Visual Designer, you have to implement a lot of interfaces with a single class. That creates more members than I'd wish (and I know your opinion of the .NET world already).

Agreed that if you have 1000 business entity classes all inheriting from some BE superclass and you load all 1000 of them into the same session at the same time, then you will have 1000 copies of that common code instead of one ... but how realistic is that? I would think it unlikely that any one session would be handling more than a handful of BE types simultaneously. I just don't see much likelihood of huge amounts of duplicated code.

Not all 1000 would need to be loaded at the same time. But even with a small fraction of that number of objects there is going to be a huge difference, right? See my calculation above for just two.

I just don't see how a PL really impacts this. While I was a big fan of sticking a variant .p in front of the standard .p on a PROPATH, the idea of sticking in an alternate superclass that is not the one against which the leaf class was compiled just seems wrong to me.

You didn't wanna see that in the other thread on library protection levels. When a library could become signed and sealed, then no one could just place a different .r file in front of the PROPATH. It would just not be accepted by the runtime. In .NET, a class knows the strong-typed name of any assembly it references classes from. That means a public key token of the publisher, the exact version number and, if you like, also the language.

An alternative might be to just store the CRC value of a base class in the leaf class.

Posted by Thomas Mercer-Hursh on 09-Jul-2010 14:19

Got me

I'm relieved.  I think 300MB is bigger than my entire ERP suite of r-code.

So in my case there are 300 kB less loaded from disk and required in  memory while in your case 2 * 300 kB would have to get into and remain  in memory.

Do you really have base classes of 300KB?   That are shared by lots of leaf classes?

I can imagine something like Order being a pretty good size, and there being maybe a couple of leaf classes for it, so, yes, one might have two or three leaf classes loaded when processing Orders and have some redundancy. But I have a hard time thinking that any of these are going to be 300KB. And if I have something like a Business Entity superclass which is a parent to hundreds of different entities, not only do I expect it to be very small, but I only expect a handful of those to be active at any given time.

I don't dispute that this would result in somewhat higher memory usage.  I do question that the increment is likely to be anything like multiples of 300KB.

And consider the alternative ... i.e., what we have now. We are creating many more objects, with the measured overhead that brings. And as we reference the properties and methods of those objects, we have multiple redirects to deal with. Not only is that also going to use some memory, but it demonstrably impacts performance on creating new instances of leaf objects and prevents optimization.

I doubt that the AVM would be able to identity "parts" of different  R-Codes as identical and keep only one version of that part in memory

I'm not suggesting that it would identify parts. I was merely saying that, once loaded ... a one-time operation with -q ... there are no additional loads required, even if one is processing hundreds of instances. I.e., the load penalty is paid once per session or task, but the create penalty is paid for every instance.

My example is from the UI

Can you tell me a little about what produces this structure?  It just seems counter to my expectations.

But even with a small fraction of the number of objects there is going  to be a huge difference, right? See my calculation about just two above.

There are three separate factors here -- the actual size of the base class, the total number of children, and the number of children likely to be referenced at the same time. As I said, I can see that something like Order might be large ... although I am still dubious about 300KB ... but it is going to have at most a few children, so even if all of them are active at the same time, the multiplier is not large. And I can see something like a base Business Entity class which is the ultimate parent of all business entity classes, potentially hundreds, but then only a handful of them are going to be active at one time, and a common base class like that is going to be very small. So, what I am missing here is the use case where the base class is extremely large, has a large number of children, and a large number of children are active at the same time. I'm not claiming that it is impossible, just questioning why and when this would occur.

You didn't wanna see that in the other thread on library protection levels. When a library could become signed and sealed, then no one could just place a different .r file in front of the PROPATH. It would just not be accepted by the runtime.

I understand the concept for protection, whether or not I think it is needed. I understand that such a signature would remove the possibility of a PROPATH change instantiating a super class different from the one the leaf class was compiled against. What I don't understand is why sticking in a different super class even works. If it is not the same super class in exactly the same state as when the leaf class was compiled, then all those compile-time type checks are meaningless. Not only does it seem like an incredibly easy way to shoot yourself in the foot, but in the absence of something like your signed PLs it seems like an ideal way for someone to violate the security of an application.

Posted by Shelley Chase on 13-Jul-2010 13:13

The AVM knows when it is dealing with an object versus a procedure. Much of the dynamic setup necessary at runtime for a procedure can be done at compile time for a class.

As far as combining the r-code goes, I can understand both viewpoints, and we considered doing it both ways. There are two major reasons we went with separate r-code files. Keeping one version of A.cls's r-code in memory is much smaller than having each subclass of A keep its own copy of the r-code. I guess we could have kept identifiers for each class's r-code and avoided loading duplicate code, but that would have required a major rewrite of the r-code loader, and the footprint on disk would have been much larger for an application going from procedures to classes. The second point I already mentioned is the "loose coupling" in the hierarchy for bug fixes and enhancements.

-Shelley

Posted by Thomas Mercer-Hursh on 13-Jul-2010 13:40

I appreciate that there is a tradeoff here between minimizing memory and the number of objects, but ...

  • If one is dealing with a single leaf class, the same code will be loading either way;
  • Duplication of source code only happens for cases where there are multiple leaf classes from a common super;
  • The most likely case of that is highly generic superclasses, e.g., a business entity super, and that is almost certainly going to have minimal code; and
  • The most duplication will happen with something like two or three flavors of Order, where Order itself is complex, but the duplication will be limited to just those two or three instances.

On the flip side:

  • Separate classes multiplies the number of class instances in every case, no matter how small the super;
  • Multiple class instances multiplies the performance penalty of creating new classes, no matter how small;
  • Separate classes means no potential for optimizing, e.g., cascaded supers or running code in the super;
  • Separate classes introduces risk of a super in the PROPATH that  is not the one the subclass was compiled against.

On balance, this seems like a place where the mentality of superprocedures has been allowed to infect an OO construct.  Are there any other languages where a generalization hierarchy is implemented as N separate classes?

Posted by Admin on 13-Jul-2010 13:59

infect an OO construct.  Are there any other languages where a generalization hierarchy is implemented as N separate classes?

.NET

Posted by Peter Judge on 13-Jul-2010 14:24

tamhas wrote:

  • The most likely case of that is highly generic superclasses, e.g., a business entity super, and that is almost certainly going to have minimal code; and

I'd disagree with this: assuming your BEs are built on ProDataSets (let's not have that discussion here again, please), you are going to have a non-trivial amount of code related to the management of the dataset and its operations.

Or if you're using MVP, a common Presenter class will be used plentifully and will have non-trivial amounts of functionality, too.

-- peter

Posted by Admin on 13-Jul-2010 14:34

Or if you're using MVP, a common Presenter class will be used plentifully and will have non-trivial amounts of functionality, too.

Much like my 300k super-class. So it's reality. And this class only controls the UI side of things.

And a couple of hundred leaf classes of that is not very extreme for a large ERP system.

Posted by Admin on 13-Jul-2010 15:03

And let me add that the security you are requesting is achieved by digitally signing assemblies (= class libraries; remember my suggestion for an enhanced role for procedure libraries) and storing the public key of a referenced class's assembly with the consumer class. And when you want to sign your assembly, you need to make sure that all referenced assemblies are signed as well.

Posted by Thomas Mercer-Hursh on 13-Jul-2010 16:12

OK, but if you have a business entity based on PDS, you only have code duplication if that BE has subclasses which are in simultaneous use. Note that, with the M-S-E pattern, there are no such subclasses, since the role is filled by Decorator.

Likewise with Presenter ... do you have two or more leaf classes active in the same session?

Posted by Thomas Mercer-Hursh on 13-Jul-2010 16:13

If true, another reason not to be using .NET as a model for anything.

Posted by Thomas Mercer-Hursh on 13-Jul-2010 16:16

I'm not actually the one requesting the security, but my point was that the idea that one can do that now is bizarre. Adding such security to the language may or may not be a useful thing to do ... a different topic of discussion ... but it doesn't exist now and, even if added, would have to be enforced; it wouldn't be automatic.

Posted by Admin on 13-Jul-2010 23:09

I'm not actually the one requesting the security, but my point was that the idea that one can do that now is bizzarre.

I must be misunderstanding you. I have the feeling that you are criticizing the current possibilities. Wouldn't it be a logical consequence to ask for a change in that area?

Posted by Admin on 13-Jul-2010 23:14

If true, another reason not to be using .NET as a model of anything.

Does Java compile base classes into every leaf class? That's hard to believe.

Posted by Admin on 13-Jul-2010 23:14

Likewise with Presenter ... do you have two or more leaf classes active in the same session?

Absolutely. Users like to work that way and have many screens (instances of the same screen and different screens) open.

Moving from a single window app (non persistent programs) to a multi window app (potentially MDI) is a common user request when modernizing the UI.

Posted by bsgruenba on 13-Jul-2010 23:23

mikefe wrote:

If true, another reason not to be using .NET as a model of anything.

Does Java compile base classes into every leaf class? That's hard to believe.

As I understand Java, .NET and C++, all of them instantiate the base class(es) for each instance, but they do share code segments. So the overhead is really small.

Thomas: Perpetually making the statement that Java/.NET/C++ are not models to be followed for anything is one of the most ridiculous arguments I have heard, especially as you obviously have no experience with any of these languages. If you spoke from a position that indicated some kind of experience on the subject, I think you'd find that we would be a little more receptive to your arguments. As it stands here, though, this is nothing more than an ignorant rant.

Posted by Admin on 13-Jul-2010 23:29

As I understand Java, .NET and C++, all of them instantiate the base class(es) for each instance, but they do share code segments. So the overhead is really small.

So the ABL's implementation is in fact industry standard.

Posted by bsgruenba on 13-Jul-2010 23:42

mikefe wrote:

Likewise with Presenter ... do you have two or more leaf classes active in the same session?

Absolutely. Users like to work that way and have many screens (instances of the same screen and different screens) open.

Moving from a single window app (non persistent programs) to a multi window app (potentially MDI) is a common user request when modernizing the UI.

In any of the other OO languages, each column of each row in a table is an instance in its own right that contains numerous other objects, which are normally extensions of several interfaces/base classes. In a table that has 30 columns where you fetch 40 rows in a batch, you are likely to be instantiating at least 1200 objects (30 column objects multiplied by 40 rows, each row being a collection of columns). When you consider that the collection (row object) is a pretty heavyweight object in its own right, you're probably talking closer to 4000 objects for a JDBC call to the database.

Now... Those classes all inherit from substantial base classes and there is a lot of code in those bases. As Java externalizes all string resources, it's hard to draw a direct comparison with OpenEdge, but class files of 30/40k are commonplace and base classes that compile to 100+K are not entirely unusual.

So I guess what Mike is asking for here is not unusual and when the objects are user interface components, there is no doubt that they can get very big very quickly. And UI is probably the most useful place to have object-orientation.

Posted by bsgruenba on 13-Jul-2010 23:49

Yeah, but as Thomas points out, it's a bad standard

I believe (if my history is right) that C++ is 30 years old. It has become the base for numerous other languages and it has been reasonably successful at completely revolutionizing computing, and yet we are to take it as a bad model? I don't think so.

Posted by Thomas Mercer-Hursh on 14-Jul-2010 11:40

I am requesting a change ... to compilation into a single object.  Security additions may well have their use, but they aren't going to be there by default and I wouldn't want to have to apply security to every generalization hierarchy.

Like some of the other areas of discussion, I think there is an element here of guide versus enforce. It isn't so much that I want to enforce that the superclasses loaded are the same ones used for compilation as that I would want to keep people from making the mistake of loading alternate versions.

Posted by Thomas Mercer-Hursh on 14-Jul-2010 11:44

OK, but how much bloat are we really talking about? If there is one large parent with a couple of leaf classes in the same session, we have the bloat of the size of the parent for each leaf class from the second one up, but only for that parent class. Are you going to have tens or hundreds of such classes in a single session?

Posted by Thomas Mercer-Hursh on 14-Jul-2010 11:50

but they do share code segments. So the overhead is really small.

How is this different from what is happening in ABL?  Or is it?  Is it just that the cost of instantiating a class is so much higher in ABL?

Posted by Thomas Mercer-Hursh on 14-Jul-2010 12:00

C++, Java, and C# all have aspects which provide good models and aspects which provide bad models. Mostly, it is in what people do with these languages that the bad models arise.

It seems to me the real point here is not whether ABL is doing the same thing as is done in 3GL OO languages, but what the tradeoffs are. I listed what I thought were the tradeoffs quite some time back. We have had some additional testimony to suggest that perhaps the problem of duplication is more severe than I would have expected. That is a new data point. Fine. It doesn't, however, do anything to address the topic of this thread, which is the substantial performance penalty one encounters from using inheritance. We probably need some additional tests to see whether this is only a penalty at the time of NEW, in which case it may be less significant, or whether the context switching also impacts execution.

Posted by bsgruenba on 14-Jul-2010 12:40

tamhas wrote:

but they do share code segments. So the overhead is really small.

How is this different from what is happening in ABL?  Or is it?  Is it just that the cost of instantiating a class is so much higher in ABL?

I believe, and now I am really going by what I have heard rather than what I can vouch for, that in the other three languages a single, aggregated data segment is created for an instance of an object. So, if A is extended by B which is extended by C, there is one code segment per class that is shared by all instances, and each instance of C will have an aggregated data segment that includes the data segment portions of A and B.

In the ABL, I believe that although instances share one code segment, each instance of C will result in separate data segment instances for A and B.

As I said, this is based on something that I read a while ago that I could not point you at right now so it would require verification, and may well have been misunderstood.

Posted by Thomas Mercer-Hursh on 14-Jul-2010 12:51

So, if your memory is correct, in those languages there is a one-time per-class cost to set up the code segment for each class in the hierarchy, and a per-object cost to set up a single data segment for the leaf class. Whereas in ABL, there is a one-time per-class cost to set up the code segment for each class in the hierarchy, but a per-object cost to set up a data segment for each class in the hierarchy.

I would think that, in a test like the one that initiated this thread, this change would remove virtually all of the performance overhead of inheritance.

And it would mean that ABL is *not* following the model of these languages.

Posted by Admin on 14-Jul-2010 12:54

OK, but how much bloat are we really talking about?  If there is one large parent with a couple of leaf classes in the same session, we have the bloat of the size of the parent for each leaf class from 2 up, but only for that parent class.  Are you going to have 10s or hundreds of such classes in a single session?

In parallel, at the same time, maybe a couple of tens. Over the course of a client session (9:00-17:00) it may be a lot more, or the same amount, depending on the user's role.

But don't forget deployment! A huge ERP package can easily contain a couple of hundred, easily more than 1000, of these UI objects.

Remember that ADM1 app I told you about (700 MB of R-code)? It has 915 objects that are similar (from the user's point of view) to what I'm dealing with here. Do you want to deploy 300 MB extra (and this time I really am talking about 300 MB) because of a performance penalty that Progress has already indicated they are going to address? And for the sake of not being able to replace a base class in the PROPATH with something evil, you could store a time-stamp or CRC of the base in the leaf, or go the library route. All of this results in far less R-code that has to go from disk to memory, and thus better performance and deployment advantages!

Posted by bsgruenba on 14-Jul-2010 13:09

A basic datagrid control with customizations in it can rack up 300K of ABL in no time at all, and you may have 15 or 20 instances (or extended instances) of that control on the screen at any point in time. A decent Order-Entry UI could easily trigger this.

Posted by Thomas Mercer-Hursh on 14-Jul-2010 13:37

Out of the 300KB, how much is data and how much is R-code?

And, assuming that is r-code, let me see if I have your numbers right. Are you suggesting that it will be typical rather than exceptional to have 20-30+ simultaneous cases in which there is a 300KB superclass with two or more leaf classes active at the same time? Or might it be more correct to say that, out of N classes active in a session, there are 20-30+ instances where the superclass is 300KB of r-code and there is an additional leaf class?

Posted by Thomas Mercer-Hursh on 14-Jul-2010 13:42

OK, but it isn't just that there are 15-20 instances. If there are 15-20 instances of unique classes, then all of the code for those classes has to be in memory anyway. What we need here is 15-20 instances in excess of this minimum base code, i.e., 20 different datagrids all derived from the same 300KB superclass, or 40 instances of 20 different hierarchies with 2 leaf classes per hierarchy and a 300KB superclass in each hierarchy.

Is that what you are suggesting?

Posted by Admin on 14-Jul-2010 13:44

I was talking about a couple of tens (30 may be a better guess than 20) of leaf classes of the same base class that are simultaneously in memory. At the same time there will be a similar number of instances of different leaf classes of other base classes.

Bruce was talking about something else, I guess.

Both cases are relevant and real world.

Posted by Thomas Mercer-Hursh on 14-Jul-2010 15:52

So, let's bring this back to the top and see if we can't pull together a consensus.

In re the code, it does appear that there may be use cases in which compiling to a single module would be disadvantageous, so let's say that the overall current structure is OK, but that it would be nice to see some optimization where possible, e.g., avoiding the execution of three empty constructors and minimizing the context-switching penalty of executing code in the supers. Let's make the security issue a separate topic with broader implications.

In re the data, the appearance is that initializing three separate data segments is slow, and thus it would be nice to integrate the data segment across the hierarchy so that there is only one instantiation for each leaf object.

OK?

Posted by Admin on 14-Jul-2010 16:19

In re the data, the appearance is that initializing three separate data segments is slow, and thus it would be nice to integrate the data segment across the hierarchy so that there is only one instantiation for each leaf object.

As long as that doesn't cause trouble of any kind when different private data members with the same name are defined in a base and a leaf class, that sounds like a possible optimization.

But as long as you don't prevent replacing the super class with a patched version, you'll have to parse the data members of both classes. You can't trust that the r-code of the leaf knows anything about the (private) data members of the super class: a new one might have been introduced in the patch...

But I don't know enough about what's going on when instantiating a class and its supers to fully judge here.

Posted by Thomas Mercer-Hursh on 14-Jul-2010 16:29

As far as mapping a variable in one class to the data segment, I'm sure that is just a pointer anyway, so name conflicts should not be an issue.

As for the rest, yes, there are things to resolve so that the task is not trivial, but if Bruce's memory is correct, it is a problem solved by many other languages, so presumably PSC can solve them too.

Posted by bsgruenba on 14-Jul-2010 16:40

tamhas wrote:

As far as mapping a variable in one class to the data segment, I'm sure that is just a pointer anyway, so name conflicts should not be an issue.

As for the rest, yes, there are things to resolve so that the task is not trivial, but if Bruce's memory is correct, it is a problem solved by many other languages, so presumably PSC can solve them too.

You presume too much. As I have said before, OpenEdge OO is based on the super procedure model, which is effectively runtime-determined inheritance. I know that some work has been done to make this a more static mechanism in OO, but fundamentally it is still based on dynamic inheritance.

Given that, it is going to be a non-trivial task to make this work differently, and frankly I don't believe that a change as fundamental as this is likely to happen.

Posted by Thomas Mercer-Hursh on 14-Jul-2010 16:51

I don't think it hurts to ask for what we want.

You got any better ideas?

Posted by bsgruenba on 14-Jul-2010 17:25

tamhas wrote:

I don't think it hurts to ask for what we want.

You got any better ideas?

Yeah. I do, actually.

At the heart of the reason you are raising this issue is the fact that there is a direct performance impact on building and working with collections. Any time we use the 4GL to do what is effectively system programming, we are using the 4GL beyond its intended purpose. You needed to instantiate several thousand classes to see this performance issue, and the truth is that the 4GL was never designed to do what you are trying to do.

So the right solution is to get PSC to realize that they are the ones who are encouraging us to bastardize the 4GL into a systems programming language because they will not address the fundamental issues that we are trying to work around - availability of collections in the core language.

The problem with OO in the 4GL is exactly the same issue that caused the death of a project called FutureShock that Pete Sliwkowski and a couple of others worked on back in the early 90's. The 4GL simply cannot meet the performance requirements needed to do the kinds of things that we are trying to do. It's not designed to. It's doing a whole lot of other stuff behind the scenes to make business logic development simple.

The difference between now and then is that the 4GL is running on much faster equipment, so the performance hit is not as bad as it was then - we've thrown hardware at a software problem. But when you shine a light on the software problem by trying to build systems-level functionality in the 4GL, and then exacerbate it by performing high-cost processes in highly iterative loops, the problem really does explode in front of your face.

This is one of those cases, and accepting the 4GL's limitations and working within them is not only pragmatic but also very important. For collections, the cases that I need to satisfy seldom exceed 100 objects, and if they do, I am going to use a temp-table instead. I can deal with the performance overhead of iterating through 100 objects, especially if it is business logic that is not really that processor intensive.

It's all a case of using horses for courses.
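(For illustration of the temp-table alternative Bruce mentions, a minimal sketch of a keyed collection substitute. The table, field, and class names are invented, and holding object references in a temp-table field assumes a release that supports Progress.Lang.Object fields:)

DEFINE TEMP-TABLE ttItem NO-UNDO
    FIELD ItemKey AS CHARACTER
    FIELD Item    AS Progress.Lang.Object
    INDEX idxKey IS PRIMARY UNIQUE ItemKey.

/* add an entry */
CREATE ttItem.
ASSIGN ttItem.ItemKey = "order-42"
       ttItem.Item    = NEW Leaf(1, 2, 3, 4, 5).

/* keyed lookup via the index, no iteration needed */
FIND ttItem WHERE ttItem.ItemKey = "order-42" NO-ERROR.
IF AVAILABLE ttItem THEN
    MESSAGE ttItem.Item:ToString() VIEW-AS ALERT-BOX.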

Posted by Thomas Mercer-Hursh on 14-Jul-2010 17:50

Except that this thread has nothing to do with collections per se. The performance impact here is from creating objects which are part of a generalization hierarchy. True, if one is only creating 10 of them, that overhead is probably immaterial. But if the hierarchy is five deep (I hope not), then your 100 leaf objects are going to mean paying for 500 object instantiations, whether you keep them in a collection class, temp-table, array, or whatever.

Posted by Admin on 15-Jul-2010 02:14

So the right solution is to get PSC to realize that they are the ones who are encouraging us to bastardize the 4GL into a systems programming language because they will not address the fundamental issues that we are trying to work around - availability of collections in the core language.

Where's the "like" button on PSDN?

First an untyped collection, then generic collections. Keyed and unkeyed, please.

Posted by bheavican on 15-Jul-2010 08:48

I'm a newbie to a lot of this stuff, and we have experienced terrible performance hits with our GUI for .NET forms, at least those with a lot of controls. My question: if there is a performance hit on classes that you create that inherit other classes, does that same performance hit apply to classes/objects that we don't have control of? For example, the Infragistics UltraGrid instantiation?

Posted by Admin on 15-Jul-2010 09:29

I'm a newbie to a lot of this stuff, and we have experienced terrible performance hits with our GUI for .NET forms, at least those with a lot of controls. My question: if there is a performance hit on classes that you create that inherit other classes, does that same performance hit apply to classes/objects that we don't have control of? For example, the Infragistics UltraGrid instantiation?

If it gets better after loading the same controls a second time in the same session, it's the loading of the huge Infragistics assemblies. That's a one-time performance hit.

Posted by Thomas Mercer-Hursh on 15-Jul-2010 11:42

Also, note that the .NET controls are being instantiated by the CLR, not the AVM, so whatever happens there is what happens to any .NET class being instantiated.  Of course, I suppose there is a potential for some overhead in creating links across the bridge, but I don't know if that has been tested.

Posted by bsgruenba on 15-Jul-2010 14:57

tamhas wrote:

Also, note that the .NET controls are being instantiated by the CLR, not the AVM, so whatever happens there is what happens to any .NET class being instantiated.  Of course, I suppose there is a potential for some overhead in creating links across the bridge, but I don't know if that has been tested.

I don't know this for a fact, but I would not be surprised to learn that the ABL-CLR bridge actually introduces a performance penalty here.

Although all UI updates have to take place on the UI thread in the CLR, in C# controls and their libraries can be instantiated on background threads, as long as the UI updates happen on the UI thread.

OpenEdge has to respond to the events on the UI thread and therefore could impose restrictions on how quickly the next line of ABL code gets executed as controls are loaded.

What this means is that code that is executed on the UI thread in OpenEdge could be running significantly more slowly than the equivalent code would run if it were built in .NET only.

It would be very interesting to test this out and prove it to find what the real penalty is. I have long suspected that it may, in fact, be faster at run time to build your UI as a .NET UserControl and then host it on an OpenEdge form, rather than instantiate each control in OpenEdge and have OpenEdge handle all the interaction.
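(A hypothetical sketch of that approach: OrderEntryControl is an invented .NET UserControl whose internal setup all happens in the CLR, leaving only a handful of bridge calls on the ABL side.)

DEFINE VARIABLE oForm    AS Progress.Windows.Form NO-UNDO.
DEFINE VARIABLE oControl AS Acme.Controls.OrderEntryControl NO-UNDO.   /* invented control */

oForm    = NEW Progress.Windows.Form().
oControl = NEW Acme.Controls.OrderEntryControl().   /* all its child controls are built inside the CLR */
oControl:Dock = System.Windows.Forms.DockStyle:Fill.
oForm:Controls:Add(oControl).
oForm:ShowDialog().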

Posted by Matt Baker on 15-Jul-2010 15:07

bsgruenba wrote:

Although all UI updates have to take place on the UI thread in the CLR, in C# controls and their libraries can be instantiated on background threads, as long as the UI updates happen on the UI thread.

This is not true under certain circumstances. Trust me, I have had to debug some of the .NET UI code; if anyone tells you different, don't believe them. In most cases .NET will delay the instantiation of the actual window handle until the first time it is needed. This is the equivalent of the ABL pattern of setting window:visible = true: you can make changes to the ABL object until that point, but certain changes are disallowed afterwards. Once the UI needs to be updated in .NET, the CLR will allocate a handle and bind it to an execution context which references the thread it was created on. The execution context dictates what thread the widget gets painted on.

If you create a widget on a background thread, you run the risk of it receiving the wrong execution context, which means it gets bound to the wrong thread. This "non-UI" thread is not the thread that has the message pump on it. You won't run into immediate problems, but it WILL bite you eventually. If you do this and then one of the desktop events like a bit-depth change, screen-size change, or screen-saver activation occurs, your UI WILL hang.

Don't ever think you can get away with this safely.

Posted by Thomas Mercer-Hursh on 15-Jul-2010 15:14

Wouldn't the bridge only be a factor in slowing the *instantiation* if there was activity in the constructor which went across the bridge?  To be sure, I can see an impact between instantiation and *ready for action* because of traffic across the bridge.

Posted by bsgruenba on 15-Jul-2010 15:38

Matt,

I think you misread what I was saying. You are absolutely correct that you cannot update a control in any way except through the UI thread; in fact, in anything after .NET 2.0, the CLR will trap an attempt to do so and throw a runtime exception. I was not saying that controls can update the UI or invoke UI events on anything but the UI thread.

However, the term "control" is really overloaded, because controls can have both a visual and a non-visual component. Part of instantiating a control involves building a whole bunch of context around the control, much of which has absolutely no UI implication. Data controls, particularly, are completely non-UI controls, yet they are still controls. Many of them do a lot of work on non-UI threads to retrieve data, etc.

If you want an example of code that actually does this, take a look at the code on my blog at this URL:

http://www.thesoftwaregorilla.com/2010/06/openedge-gui-for-net-testing-the-bridge/

Inside a control, you can delegate a lot of work to background threads, and as long as those controls invoke all their events on the UI thread, you are perfectly safe to use them. That, in fact, is exactly what the code in that example illustrates.

Now the same is true of controls that have large numbers of objects, like the Infragistics controls. They do a lot of work in the background on background threads and then update the UI and invoke events on the UI thread. Many of these controls rely on asynchronous callbacks to delegates that then post the events (including UI updates) on the UI thread.

The problem is that each time these events come back, they have to make their way across the bridge to the ABL so that the ABL can react to them, and this, necessarily, imposes the overhead of a call being marshaled across the bridge and invoking the AVM.

When a form is being built, the contents of the form are loaded in a single-threaded set of ABL calls across the bridge to the CLR, and any callbacks that result in events in the AVM will be deferred until the form has been completely built. While this is no different from the behavior in .NET, the additional overhead of the calls across the bridge has to have a performance impact.

Regards,

Bruce

Posted by bsgruenba on 15-Jul-2010 16:03

tamhas wrote:

Wouldn't the bridge only be a factor in slowing the *instantiation* if there was activity in the constructor which went across the bridge?  To be sure, I can see an impact between instantiation and *ready for action* because of traffic across the bridge.

That's my point. Every line of code associated with instantiating a form happens as a result of a call from the constructor that initializes the controls on the form (I don't have the ABL in front of me right now, but I believe the method is called InitializeObject()).

InitializeObject() contains code that instantiates each control, sets its properties, associates event handlers with events, and displays the form on the screen. Each of those lines of code results in a call across the bridge to the CLR to perform that function on a control. If any of those controls have background threads that perform setup work requiring a response from the AVM (e.g., an event handler exists on control load), the event will be deferred until all the rest of the setup has been done on the form.
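(A hypothetical sketch of the shape of such a method; the control name and handler are invented, and the real generated code is far longer:)

DEFINE PRIVATE VARIABLE btnOK AS System.Windows.Forms.Button NO-UNDO.

METHOD PRIVATE VOID InitializeObject ( ):
    btnOK = NEW System.Windows.Forms.Button().   /* one bridge call */
    btnOK:Text = "OK".                           /* another bridge call */
    btnOK:Click:Subscribe(OnOKClick).            /* event wiring back into the AVM */
    THIS-OBJECT:Controls:Add(btnOK).
END METHOD.

METHOD PRIVATE VOID OnOKClick (sender AS System.Object, e AS System.EventArgs):
    /* handler runs in the AVM each time the event crosses the bridge */
END METHOD.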

Posted by Matt Baker on 16-Jul-2010 07:49

bsgruenba wrote:

Although all UI updates have to take place on the UI thread in the CLR, in C# controls and their libraries can be instantiated on background threads, as long as the UI updates happen on the UI thread.

My comment was about creating them on a background thread. Don't do this. It wasn't about updating them.

Yes, .NET does have checks in place to prevent you from causing screen updates on non-UI threads, but those checks are based on the execution context, which stores a reference to the handle of the thread that was active when the UI handle was created. The execution context's thread is checked against the current thread and the appropriate exception is thrown (or not).

Creating a *UI control* (anything that allocates UI resources) *sometimes* creates that handle during the execution of the constructor. *Normally* the handle creation is delayed until after the constructor, but not always. The handle, and the setting of the execution context, is created when first needed. This was a performance enhancement (read the .NET source code comments) by Microsoft for those cases where a control is created but never visualized. .NET buttons and tab controls do this, and so do some of the Infragistics controls. You have no way of knowing which ones.

So my point is that *creating* controls on a background thread is a bad idea, because when you get around to updating them (e.g. calling BeginInvoke() or similar from a background thread) you may cause a hang.

If the control has no UI, then this is a non-issue.

This thread is closed