In search of collections

Posted by Thomas Mercer-Hursh on 05-Oct-2009 17:09

In some recent discussions, I have encountered a strong belief among some people that ABL should have collections.  I'm not yet convinced myself that temp-tables don't provide us with the tools we need, but I thought it might be worth throwing out the question to see what thoughts there might be.

The main impetus behind wanting collections seems to be to have a lightweight container for holding some set of objects.  "Collection" can actually mean several different things.  In Java in particular, Collection is the most basic interface, one that allows one to hold a set of objects without even a sort order and with no assumptions about uniqueness, but the word is also used as the generic term covering the whole range of collection and map class hierarchies, which can include key fields, sorting, uniqueness, and other properties.  See my discussion here http://www.oehive.org/CollectionClasses

So, the first question might be, what is it that people actually want?

The key characteristic would seem to be the lightweight part, i.e., not getting involved in the issues associated with the Too Many Temp-Table (TMTT) problem, and possibly a sense that one is somehow paying a performance price by using a temp-table in conditions where one doesn't need all the features which a temp-table provides.  Personally, I am dubious about there being a measurable performance penalty and I think there are design patterns which avoid the TMTT problem, which is why I am skeptical about the need for collections.

It should be noted that Java collections and map classes are really just classes written in Java, not language intrinsics.  They are simply a part of the generic library that is available so they can be treated like a universally available facility.  This raises the question of why one doesn't just write a set of collection and map classes for ABL.  Well, I did that back in 10.1A (see the link above) and, other than a certain lack of elegance due to the lack of generics in the ABL, they do the job.  But they use temp-tables in the implementation, so this doesn't really address the issue of those people wanting something more lightweight.

There seems to be an assumption that collections should be added as a built-in of the language.  This, of course, raises the question of which behaviors out of the overall collection and map classes it would implement.  It seems to me that by the time one is in the category of SortedMap, i.e., key-object pairs in a particular sort order, then one is pretty much in the territory of temp-table, so let's focus specifically on things in the Java collection inheritance hierarchy, i.e., just objects, not key-object pairs.  Let's also assume that uniqueness is something one could provide with a wrapper and that all that is necessary is that we can add objects in a desired order and get them back in the same order.  If we had that much, it seems we could provide an alternate sort order with a wrapper as well.  I.e., let's assume that the primary requirement is simply for a set of objects and the set is small enough that we can traverse the whole set if necessary, although a facility to find a particular object or the Nth object seems like it could be a plus.
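
As a minimal sketch of "uniqueness provided by a wrapper", here is what that could look like in Java around a plain insertion-ordered list. The class name UniqueList and its methods are hypothetical, chosen only for illustration, and the linear scan in add assumes the small sets described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper: keeps insertion order and rejects duplicates.
final class UniqueList<T> {
    private final List<T> items = new ArrayList<>();

    // Adds the item only if it is not already present.
    // The contains() call is a linear scan, acceptable for small sets.
    boolean add(T item) {
        if (items.contains(item)) {
            return false;
        }
        items.add(item);
        return true;
    }

    // Returns the Nth item (zero-based) in the order it was added.
    T get(int n) {
        return items.get(n);
    }

    int size() {
        return items.size();
    }
}
```

The underlying container knows nothing about uniqueness; the wrapper adds it, which is the division of labour suggested above.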

So, one question we might have is whether we already have something in the language which could be enhanced a little to provide this functionality without a large amount of work?

One idea that comes to mind is arrays.  The problem with arrays, of course, is that they are of fixed size.  We now have arrays which don't have a predefined size where we can fix the size at the time we are ready to use it, but we don't have the ability to change the size once it has been realized.  Of course, this is essential if we are going to add new members to the collection, unless we are willing to define some upperbound number as the maximum count of members in a collection, a very tacky idea.  If we could gain the ability to resize an array, it could fill this need nicely since there is a natural sequence based on the element number and it would be easy to extend that in various ways using companion arrays.  Ideally, one would also gain the ability to make block assignments, i.e., to assign elements 5 to the end to elements 6 to the end in order to create an empty slot between element 4 and the old element 5.  But, for smaller sets at least, that could be imitated in various ways.  Of course, I have no idea what arrays look like under the skin so I have no idea how easy it would be to make them dynamically resizable.
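
For comparison, the "block assignment" idea (shifting the tail of the array to open an empty slot) can be sketched in Java, where System.arraycopy does the block move. SlotInsert and insertAt are hypothetical names, and the sketch allocates a new array one element longer, since Java arrays cannot grow in place either. Indexes are zero-based here, unlike ABL extents.

```java
// Hypothetical helper illustrating a block move that opens a slot.
final class SlotInsert {
    // Returns a new array one longer than src, with value placed at index
    // and every element from index onward shifted one position right.
    static Object[] insertAt(Object[] src, int index, Object value) {
        Object[] dst = new Object[src.length + 1];
        System.arraycopy(src, 0, dst, 0, index);                          // elements before the slot
        dst[index] = value;                                               // the new element
        System.arraycopy(src, index, dst, index + 1, src.length - index); // shifted tail
        return dst;
    }
}
```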

The other idea I had is work-tables ... maligned legacy things that they may be.  They provide an open-ended collection of elements with a sort order based on order of creation, but with a mechanism for inserting records at any location, including the front.  And, by defining two fields, one for the object and one for a key, one would also cover maps, albeit without automatic sorting by the key.  This sounds ideal, except for one hitch ... one can't currently define a field in a work-table as being a class, not even Progress.Lang.Object.  But how hard could that be?  It seems likely that the main reason it isn't already done is that work-tables are deprecated.  But, possibly they could be rehabilitated for this purpose.

Other ideas?  Other requirements?  Feedback from PSC on how difficult?

All Replies

Posted by Matt Baker on 05-Oct-2009 20:38

One idea that comes to mind is arrays.  The problem with arrays, of
course, is that they are of fixed size.  We now have arrays which don't
have a predefined size where we can fix the size at the time we are
ready to use it, but we don't have the ability to change the size once
it has been realized.  Of course, this is essential if we are going to
add new members to the collection, unless we are willing to define some
upperbound number as the maximum count of members in a collection, a
very tacky idea.  If we could gain the ability to resize an array, it
could fill this need nicely since there is a natural sequence based on
the element number and it would be easy to extend that in various ways
using companion arrays.  Ideally, one would also gain the ability to
make block assignments, i.e., to assign elements 5 to the end to
elements 6 to the end in order to create an empty slot between element
4 and the old element 5.  But, for smaller sets at least, that could be
imitated in various ways.  Of course, I have no idea what arrays look
like under the skin so I have no idea how easy it would be to make them
dynamically resizable.

What you have described here is commonly referred to as a "vector", not an array.  Arrays are never resizable in either Java or C#.  For arrays, you can certainly allocate a new object that the variable points to, but you cannot dynamically extend the object.  The Vector class in Java lets you do this, but it comes with the cost of having to copy the elements from the old array to the new array.  Several of the common collection classes such as ArrayList and Vector are backed by arrays, which are built into the language.  The important difference here is that Java and .NET both have fast array copy implementations (in Java this is a native call), which makes redefining them with new elements very cheap.  Assign the old array to a new variable, new an array with a bigger size, copy all the existing elements, and then replace the original array reference with the new one.  4 lines of code in Java.  In ABL you could do the same thing with about 8 lines of code.  You can extend this to a vector by checking which element you want to assign and, if the array is too small, resizing it first.  The other important difference is autoboxing, which allows you to have a single implementation of Vector into which you can stuff any object or scalar.  Generics on collection classes are nice, but not necessary, since they just enforce the type checking.
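
The four steps above, and the "resize when the target element is out of range" extension, can be sketched in Java roughly as follows. GrowableArray and its members are hypothetical names used only for illustration; in practice java.util.ArrayList already does this job.

```java
// Hypothetical minimal vector backed by a plain array.
final class GrowableArray {
    private Object[] items = new Object[4];
    private int size = 0;

    // Assigns an element, growing the backing array first if it is too small.
    void set(int index, Object value) {
        if (index >= items.length) {
            grow(Math.max(index + 1, items.length * 2));
        }
        items[index] = value;
        size = Math.max(size, index + 1);
    }

    Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return items[index];
    }

    int size() {
        return size;
    }

    private void grow(int newCapacity) {
        Object[] old = items;                            // 1. keep a reference to the old array
        Object[] bigger = new Object[newCapacity];       // 2. allocate a bigger array
        System.arraycopy(old, 0, bigger, 0, old.length); // 3. copy the existing elements
        items = bigger;                                  // 4. replace the original reference
    }
}
```

Because the elements are typed as Object, a call such as set(9, 42) autoboxes the int into an Integer, so one implementation holds both objects and scalars.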

Posted by Admin on 05-Oct-2009 23:45

Besides the various use cases for collections, there is the aspect of the widest possible generalization with something as generic as collections (a list of objects with dynamic length).

Not everyone writing new OOABL code will look at your stuff on oehive, or the collections we have developed, or somebody else's implementation. So various people might end up with various different yet very similar collection implementations, even if they differ only in the type name. They won't all implement the same interfaces; some might (on the 10.2A client) inherit from .NET collection classes, some might be built around arrays or temp-tables.

In the end YourCollection of System.Windows.Forms.Control objects and MyCollection of System.Windows.Forms.Control objects will not be type compatible. The result will be that it might be hard to use code from different parties in a single application. You might need collection converters - uuuh. I am aware, and have often faced it myself, that two people frequently build similar things. The question is whether a collection (or different collections) should be seen as something that is part of the language or part of the application. I believe they should be part of the language.

Having done a lot of C# coding and looking at open source C# sample code regularly, I believe that besides the quality of the implementation (mostly the performance questions) the generalization part makes life really easy. And I'm sure that only Progress will be able to reach every ABL developer with a base Collection class.

Having started OO coding with the full, overwhelming class library of the .NET framework on the table, collections seem like a very basic missing piece in the current OOABL.

Posted by Peter Judge on 06-Oct-2009 08:34

collections.

One of the Beta enhancements from here

In my experience, OEA just works better with OO code. It's also easier for me to work with code that's all the same paradigm (ie all OO or all proc); 'suddenly' having to work with a FOR EACH instead of a method call seems 'weird' (I mean in context, I love the ease of a FOR EACH).

In addition, the pseudo-code below doesn't care whether oMyColl is an object in which the collection is a temp-table, work-table, array, comma-delimited list (haha) or whatever the physical implementation is. It will just work with any/all of those, which makes for more robust code.

oEnum = cast(oMyColl, ICollection):GetEnumerator().
do while oEnum:MoveNext():
    oItem = oEnum:Current.
    /* magic happens */
end.
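
The same idea, sketched in Java for comparison: the loop is written against the Collection and Iterator interfaces only, so it does not care what physically backs the collection. EnumerationSketch and countItems are hypothetical names for illustration.

```java
import java.util.Collection;
import java.util.Iterator;

// Hypothetical example of implementation-independent traversal.
final class EnumerationSketch {
    // Visits every item without knowing how the collection stores them,
    // and returns how many items were visited.
    static int countItems(Collection<?> coll) {
        int seen = 0;
        Iterator<?> it = coll.iterator();
        while (it.hasNext()) {
            Object item = it.next();
            // "magic happens" with item here
            seen++;
        }
        return seen;
    }
}
```

The caller can pass an ArrayList, a LinkedList, or a HashSet and the loop is unchanged.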

So I do think there's a need for collections outside of the TMTT issue. Clearly, the underlying data structures must perform well and be efficient in terms of resource usage too, but it's easy to let the performance issues sidetrack the other benefits.

To Mike's point about PSC providing something, I can see the provision of an interface much like Progress.Lang.Error being a great starting point. It would allow app devs to code to that interface while writing their own collection classes and also allow for the possibility of built-ins at some stage.

-- peter

Posted by Thomas Mercer-Hursh on 06-Oct-2009 11:06

Arrays are never resizable in either Java or C#

But then, they have collections ... which is where I am actually trying to go here.  Using an ABL array is a crutch.

The need to copy is clearly one of the downsides of using an array.

Posted by Thomas Mercer-Hursh on 06-Oct-2009 11:12

So various people might end up with various different yet very similar collection implementations

I suppose there is some risk of this, collections being more likely to be common than other types of classes, but it isn't a big worry.  Were people to start getting serious about open source code sharing, we could probably agree on a standard set of collection classes, but we would still be left with temp-tables on the interior.

I believe they should be part of the language.

I'm not sure I do ... they aren't in other languages.  In any case, I am not optimistic at this point about convincing PSC to implement something new, so I am wondering if there is something they can do easily which will get us what we need.

Posted by Admin on 06-Oct-2009 11:49

we could probably agree on a standard set of collection classes

We (those active in forums like this) could do so - but the lowest common denominator of all ABL users is the language. I'm pretty sure that we all (once we have agreed on the open source standard collection implementation) will be facing collection implementations in projects that were implemented by developers who weren't aware of the availability of the open source standard collection implementation and cooked up their own (like with OERA implementations). And then? Then we will be back to collection converters, copying values from A to B, which won't be any better than copying array elements from A to B to resize an array.

I believe they should be part of the language.

I'm not sure I do ... they aren't in other languages. 

I have a C# background. I know that the System.Collections classes are not part of the language per se. They are part of the framework class library. But they are available to every developer. That makes it equivalent to being part of the language (as I understand it).

Posted by Thomas Mercer-Hursh on 06-Oct-2009 12:03

Actually, it would be equivalent to PSC getting behind the project to create and disseminate framework components.  But, based on the history, I'm not sure that is the best way to go.  Better, I think, would be an open source effort which PSC endorsed, contributed to as individuals, and pointed to in an official way so that it was noticed.

But, either way, that is actually off topic to this thread. Wherever the implementation comes from, today it is going to have temp-tables and there is a concern that it is too "heavy" for many common cases. It is clear that one option is for PSC to decide to implement something lightweight as a language feature, but who knows when that might happen.  What I am wondering about here is whether there is something already in the language that would take a developer a very short amount of time to turn into an effective mechanism for collections.  We might still want to bury this mechanism in general purpose classes with standard interfaces, but first we need the mechanism.

I would think enabling objects as columns in a temp-table should be pretty easy ... but what do I know?

Posted by Admin on 06-Oct-2009 12:24

I would think enabling objects as columns in a temp-table should be pretty easy ... but what do I know?

Are you asking for strongly typed columns? Put me on the list of requesters as well!

Posted by Thomas Mercer-Hursh on 06-Oct-2009 12:37

Yes, I would like to get objects of a particular type in a temp-table to avoid having to do the cast ... being able to define an index on a property of that class would be really cool!

But, no, in the context of this thread what I am referring to is being able to define a *work-table* aka workfile with a column of datatype Progress.Lang.Object.  That would give us something very much like a collection with minimum footprint.

Posted by guilmori on 06-Oct-2009 14:58

tamhas wrote:

Yes, I would like to get objects of a particular type in a temp-table to avoid having to do the cast ... being able to define an index on a property of that class would be really cool!

This would be great.

But we would still be stuck with a major annoyance of temp-tables: not being able to scope an "instance" of a temp-table to a method.

Moreover, temp-tables are no real object, so they must be treated differently in many places, which reduces the "ease of use", and they also cannot be generalized as a Progress.Lang.Object.

What I would prefer is the OOABL to provide an implementation of some kind of ICollection generic interface, which could be queried using a FOR EACH or QUERY.

Now I'm really out of scope of this thread

Posted by guilmori on 06-Oct-2009 15:05

tamhas wrote:

Actually, it would be equivalent to PSC getting behind the project to create and disseminate framework components.  But, based on the history, I'm not sure that is the best way to go.  Better, I think, would be an open source effort which PSC endorsed, contributed to as individuals, and pointed to in an official way so that it was noticed.

I do not agree. I think that extensions to the language made as built-in classes by PSC are the way to go in the future. And those built-in classes could easily be extended by an open source effort.

Posted by Thomas Mercer-Hursh on 06-Oct-2009 15:21

What is keeping you from creating and deleting a temp-table within a method ... especially if you wrap it in an object?

Moreover, temp-tables are no real object

No, but they can certainly be encapsulated in objects.

What I would prefer is the OOABL to provide an implementation of some kind of ICollection generic interface, which could be queried using a FOR EACH or QUERY.

I think I have a pattern which handles this, which I am calling SuperMap, an extension of the Java Map class concept.  More on this later since I don't want to hijack my own thread.

Posted by Thomas Mercer-Hursh on 06-Oct-2009 15:29

I do not agree.

Everyone is free to do so, but you do have to admit that collection and map classes in Java are just that, classes, someone wrote them once.  If one wants something that is almost, but not quite, one is free to subclass or even just write something new.  No magic.

If, however, PSC puts it in the language, then we are going to have certain flavors and not others.  Will they implement everything you want?  Maybe, because they have been doing a pretty good job of paying attention to principles ... other than giving me generics ... but it is chancy.  Suppose they do Collections and not Maps (in Java terms).  Helps, but doesn't cover all the bases.

Frankly, I think there is way too much in the language.  We are stuck with it now, but think how much more elegant things could have been if OO was introduced with V7 and the GUI was a class library instead of part of the language.

I think the key missing piece here, if any, is a language feature which will be lightweight at the expense of having less functionality than a temp-table.

Posted by guilmori on 06-Oct-2009 16:47

tamhas wrote:

What is keeping you from creating and deleting a temp-table within a method ... especially if you wrap it in an object?

Moreover, temp-tables are no real object

No, but they can certainly be encapsulated in objects.

You then add a wrapper overhead to the already present temp-table overhead, so we are getting further away from a lightweight container. Not to mention that you no longer have direct access to the temp-table.

I do think we need some basic built-in collection classes with a more performant internal data structure, especially for simple sets of data where no advanced navigation or query operations are required. Let's take for example a collection of string values: doesn't a temp-table seem like overkill?

Posted by guilmori on 06-Oct-2009 17:03

tamhas wrote:

The key characteristic would seem to be the lightweight part, i.e., not getting involved in the issues associated with the Too Many Temp-Table problem, and possibly a sense that one is somehow paying a performance price by using a temp-table in conditions where one doesn't need all the features which a temp-table provides.  Personally, I am dubious about there being a measurable performance penalty and I think there are design patterns which avoid the TMTT problem, which is why I am skeptical about the need for collections.

I have yet to see concrete examples of your "avoiding TMTT" patterns, but for me, this means coding around a problem, i.e., patching.

Either the temp-table is not the right tool, or the TMTT problem must be fixed.

And I've been told by tech support that the TMTT problem is expected behavior...

Posted by Thomas Mercer-Hursh on 06-Oct-2009 17:28

A revised whitepaper will be forthcoming soon followed by some sample code.

TMTT *is* expected behavior in that it seemingly never occurred to anyone that someone would have an actual use of thousands of TT at the same time.  Instances we know of in which people have actually encountered it seem to be limited to cases where large numbers were defined but not used or cases where people experimenting with OO made **everything** a TT, often of only one row.  I can't see either of these as sensible uses of TTs and the former seemed to respond significantly to tuning.

For many cases in ABL, I think the TT is the right solution, notably when there is a one-to-many relationship between two objects and there are multiple relations that connect them.  The Java classes don't even handle this problem.  I suspect, in fact, that they work just fine for a one-to-many relationship with only a single relation, i.e., the equivalent of a Java Map class.  In a 3GL, typically collections have no key and they are small enough that one doesn't mind traversing the whole set, either because the task inherently applies to the whole set or because it doesn't take long to find the desired instance anyway.  If this is a significant category of the needs in ABL too, then it might be worth having something other than TTs which would be lighter weight.  Even so, I am dubious that there will be any meaningful performance benefit, and I expect that, using good design principles, it will have no impact on TMTT either.

What is an example of a use case in which you have a 1 to many relation and don't care about order and don't care about the efficiency of accessing an individual record?

Posted by Thomas Mercer-Hursh on 06-Oct-2009 17:34

A collection of string values is one of the cases my existing Collection and Map classes handle.  OK, so it uses a temp-table, but is there any reason to care unless there is a measurable performance issue?  Do you have a demonstration of there being a measurable performance issue which isn't related to creating thousands of such collections, i.e., exceeding memory?  And, even then, did you use the available parameters to tune to see if the problem went away?

It just seems to me that one typically cares about order and very often cares about random access.  Collections provide neither except by the crude mechanism of going to the overhead of building the set in the desired order ... almost certainly more expensive than a TT with an index.

I'm not arguing against having something ... in fact, the whole purpose of this thread is to see if we can come up with something that takes minimal effort on the part of PSC.  If we had it, I would use it when appropriate.  But, I'm also not sure that I could ever measure the difference between using that or a TT.

Posted by guilmori on 06-Oct-2009 17:45

tamhas wrote:

I do not agree.

Everyone is free to do so, but you do have to admit that collection and map classes in Java are just that, classes, someone wrote them once.  If one wants something that is almost, but not quite, one is free to subclass or even just write something new.  No magic.

If, however, PSC puts it in the language, then we are going to have certain flavors and not others.  Will they implement everything you want?  Maybe, because they have been doing a pretty good job of paying attention to principles ... other than giving me generics ... but it is chancy.  Suppose they do Collections and not Maps (in Java terms).  Helps, but doesn't cover all the bases.

Frankly, I think there is way too much in the language.  We are stuck with it now, but think how much more elegant things could have been if OO was introduced with V7 and the GUI was a class library instead of part of the language.

Well, I think we are saying the same thing. I don't mean built in the language, but available as class libraries. And I would expect those classes to be written with the core language under the hood of the ABL.

I think the key missing piece here, if any, is a language feature which will be lightweight at the expense of having less functionality than a temp-table.

Yep.

And once we have that, I do still believe we need either one of the following:

- fix TMTT problem + ability for temp-table to contain other class types + ability for temp-table to index and query on public members of contained instances

- collections on which one can use existing ABL querying abilities

I prefer the second one!

Posted by Thomas Mercer-Hursh on 06-Oct-2009 18:18

Well, I think we are saying the same thing. I don't mean built in the language, but available as class libraries. And I would expect those classes to be written with the core language under the hood of the ABL.

You do realize, don't you, that you are asking for exactly what I said would have been a better choice when developing V7.  In the current language, there is no functionality which corresponds to instantiating a machine language class other than the .NET UI stuff.  So, by asking to have it implemented this way, you are not just asking for collections, but for a whole new ability to access a machine language class library.

You would be better off wishing for a new keyword ... you would have a better chance of getting it.

Of course, if they relaxed the license on the .NET so that it wasn't just about UI, then you could use a .NET collection class ... but that isn't where I would like to go.

And once we have that, I do still believe we need either one of the following:

- fix TMTT problem + ability for temp-table to contain other class types + ability for temp-table to index and query on public members of contained instances

- collections on which one can use existing ABL querying abilities

What does "fix the TMTT problem" mean to you?  I know Tim Kuehn thinks it means lazy instantiation, but that doesn't do us any good here.  If one is actually going to define all those temp-tables, I don't see what it is that one could do to "fix" the problem other than to use a smaller default size and give the session more memory, both of which we can do today.

Of course, existing TTs can contain any class type ... so what you really mean is that you want to be able to define the column as being of a specific class, not just Progress.Lang.Object, i.e., you will save the cast.  Without the other features, would that buy you much?

I will shortly be publishing a pattern ... about which you already have an e-mail ... which allows for indexing on selected properties and querying on those same properties.  Of course, it isn't using the properties themselves, but rather it is using a facade for the properties.  Not perfect, but I think it gets a lot of the job done.

If we get anything in the direction of your requested querying capabilities, I would guess it far more likely to come in the form of an enhancement to temp-tables than for this additional ability to be provided to a lesser entity like collections.

Posted by Admin on 06-Oct-2009 22:14

In the current language, there is no functionality which corresponds to instantiating a machine language class other than the .NET UI stuff.

I have to disagree. There are classes in the Progress.Lang.* package. Those classes are not implemented in ABL - I bet! Yet I would consider them part of the ABL class library, not the language.

Posted by guilmori on 07-Oct-2009 07:43

tamhas wrote:

A collection of string values is one of the cases my existing Collection and Map classes handle.  OK, so it uses a temp-table, but is there any reason to care unless there is a measurable performance issue?  Do you have a demonstration of there being a measurable performance issue which isn't related to creating thousands of such collections, i.e., exceeding memory?  And, even then, did you use the available parameters to tune to see if the problem went away?

I may be pessimistic, but this certainly raises a red flag for me. Having a not so optimal approach buried deeply in the infrastructure root, for something that will be widely used, will most certainly result in serious problems down the road. The AVM is a slow beast, and all those *seemingly small* things really add up quickly. I prefer to complain now rather than in the final testing phase of a project's development.

It just seems to me that one typically cares about order and very often cares about random access.  Collections provide neither except by the crude mechanism of going to the overhead of building the set in the desired order ... almost certainly more expensive than a TT with an index.

What is the difference between a "TT with an index" versus "the crude mechanism of going to the overhead of building the set in the desired order" ? Isn't the TT index doing the same thing ?

In .Net, we don't have to build a set in a fixed order, we can use LINQ To Objects to sort any sets on any members, on demand.
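
A rough Java analogue of that on-demand sorting, offered only as an illustration of the idea rather than of LINQ itself: the stored order of the list is left alone and a sorted view is produced per request. Customer, OnDemandSort, and the field names are hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical item type with two members one might sort on.
final class Customer {
    final String name;
    final int balance;

    Customer(String name, int balance) {
        this.name = name;
        this.balance = balance;
    }
}

final class OnDemandSort {
    // Returns a new list sorted by name; the input list is not modified.
    static List<Customer> byName(List<Customer> customers) {
        return customers.stream()
                .sorted(Comparator.comparing((Customer c) -> c.name))
                .collect(Collectors.toList());
    }

    // Returns a new list sorted by balance; the input list is not modified.
    static List<Customer> byBalance(List<Customer> customers) {
        return customers.stream()
                .sorted(Comparator.comparingInt((Customer c) -> c.balance))
                .collect(Collectors.toList());
    }
}
```

Each call sorts the whole set again, so the cost is paid per request rather than once when the set is built.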

Posted by Thomas Mercer-Hursh on 07-Oct-2009 11:20

Well, yeah, but no.  Progress.Lang.Object is more likely to be some built in behaviors of the AVM than a true object anywhere.  It is hardly a general purpose facility into which one could plug any other non-ABL class.

Posted by Thomas Mercer-Hursh on 07-Oct-2009 11:40

Having a not so optimal approach buried deeply in the infrastructure root, for something that will be widely used, will most certainly result in serious problems down the road.

Not so optimal?  I would have said that TTs were highly optimized and extremely powerful.  The question here is, is it too heavyweight for the job that actually needs to be done?  One can, after all, define a TT without an index if it really doesn't matter.  The key question here is "too heavyweight".   Is there a measurable performance benefit from using something more lightweight or a real TMTT problem if one follows best practice?

I prefer to complain now rather than in the final testing phase of a project's development.

I prefer testing and measuring to complaining in anticipation.  Back in the days of my connection to Forté, there was a delightful paper at one of the conferences that talked about thinking about systems from the very beginning in terms of where possible bottlenecks or other performance issues might be.  E.g., if the design called for messages to flow from A to B and rough calculations showed that there would be upwards of a million messages per hour, then they would write a minimal little test harness that sent a million messages an hour ... just "hello", no actual work being done in response to the message, just to see if it was possible to send that volume and keep ahead of the flow.

It is possible to construct simple tests to demonstrate the weaknesses and overhead of temp-tables for situations where their features are not needed and then we could go to PSC with a business case and point out the magnitude of the problem ... if any.  One can't test a work-table of objects, since that isn't possible, but one could test an array of objects, ignoring for test purposes that the array would have issues that might make it unacceptable.  At least it provides an alternative for comparison.  And, of course, one could test a work-table of strings against a temp-table of strings for a head to head comparison with that mechanism.
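
The shape of such a minimal harness, sketched in Java purely to show the structure: run a trivial operation a fixed number of times and record the elapsed time. ThroughputProbe and time are hypothetical names, and timings from a sketch like this say nothing about the AVM; the same structure would have to be written in ABL against temp-tables, work-tables, and arrays to measure the cases discussed here.

```java
// Hypothetical timing helper for head-to-head container comparisons.
final class ThroughputProbe {
    // Runs work the given number of times and returns elapsed nanoseconds.
    static long time(int count, Runnable work) {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            work.run();
        }
        return System.nanoTime() - start;
    }
}
```

One would call time once per candidate container with the same count and the same trivial operation (for example, adding a short string), then compare the two results.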


What is the difference between a "TT with an index" versus "the crude mechanism of going to the overhead of building the set in the desired order" ? Isn't the TT index doing the same thing ?

No.  For the TT, we have a machine language level routine which manages a highly efficient index mechanism, just like that used for database tables, in which one can add values to the index without having to sort or scan all values.  With a work-table, one has to read from the beginning looking for the gap where one record is equal or lower and the next record is higher, and insert the record there.  I suspect that would be quite a bit of overhead if the number of records was significant.
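
The contrast can be sketched in Java with stand-ins: a LinkedList scanned from the front plays the role of the work-table, and a TreeSet (a balanced tree, not the actual temp-table index structure) plays the role of the index. OrderedInsert and its method names are hypothetical, and unlike a non-unique index the TreeSet silently drops duplicate values.

```java
import java.util.LinkedList;
import java.util.ListIterator;
import java.util.TreeSet;

// Hypothetical comparison of two ways to keep values in sorted order.
final class OrderedInsert {
    // Work-table style: read from the beginning until the next value is
    // higher, then insert just before it. Cost grows with the list length.
    static void insertByScan(LinkedList<Integer> list, int value) {
        ListIterator<Integer> it = list.listIterator();
        while (it.hasNext()) {
            if (it.next() > value) {
                it.previous(); // step back so the new value lands before the higher one
                break;
            }
        }
        it.add(value);
    }

    // Index style: the tree locates the position without visiting every value.
    static void insertByIndex(TreeSet<Integer> index, int value) {
        index.add(value);
    }
}
```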

In .Net, we don't have to build a set in a fixed order, we can use LINQ To Objects to sort any sets on any members, on demand.

One interpretation of which is that .NET needed to invent LINQ because they didn't have TT and PDS.

It is worth noting that the OO world view is significantly different from the relational one.  It can be said that, if one is doing "real" OO, there is no need of this relational way of looking at things and, if one is using such mechanisms, it is a sure sign that one is not doing good OO.

Posted by gus on 07-Oct-2009 11:40

I would not characterise it as expected behaviour. /I/ certainly didn't expect it.

You are correct that it "never occurred to anyone that someone would have an actual use of thousands of TT at the same time".

Temporary tables were designed and added to the language more than 15 years ago.  For many years, no one had thousands of temporary tables at the same time and we never contemplated that possibility.  Even if we had, I'm quite sure I would have dismissed it as not worth worrying about.

There are tradeoffs in everything.  Usage often changes considerably over time and what was once a good solution to a different problem becomes inappropriate for another because the tradeoffs that were made for the earlier problem don't fit the new one.

There's been a long discussion about "delayed instantiation of temporary tables".  I no longer remember what problem that was a purported solution to.  But I do remember thinking that it was not the right solution.  Now we have a discussion about collections, which is a different problem entirely.  Using temporary tables as part of a workaround to not having collections is (not that good of) a solution and we can certainly debate the efficacy and appropriateness of it.  But don't lose sight of the problem.

I encourage you all to be clear about the actual problem that needs to be solved.  In these rambling discussions, problems often exhibit constantly changing morphism and I become confused and lost.

Posted by Thomas Mercer-Hursh on 07-Oct-2009 11:56

There's been a long discussion about "delayed instantiation of temporary tables".  I no longer remember what problem that was a purported solution to.

The specific problem that triggered that proposal was a site where Tim Kuehn was working that had a number of temp-tables defined in include files and the include files scattered liberally through the application.  The problem arose because one include file contained multiple definitions, but the program in which the include file was used might only use one of the temp-tables in that set ... the rest were referred to nowhere in the code of that module.  The result was a much larger number of temp-tables being *defined* than used.  Tim was able to improve the situation significantly by reducing the block size ... apparently the problem appeared on a version change where the default block size increased.  Tim feels that lazy instantiation would mean that all the unused temp-tables would not actually get instantiated.  Others of us felt that the more appropriate solution was refactoring to eliminate definitions which were not used.

No refactoring or lazy instantiation helps when one defines thousands of temp-tables which are actually used.  This has been seen in some of the early experiments with OO where people defined instance entities with a one row temp-table in order to take advantage of WRITE-XML and READ-XML.  In the worst case, classes were defined for every widget, each with a temp-table.  To me, the solution here is simply, "don't do that".  It is an abuse of temp-tables.  Far better to lobby for PSC to provide a WRITE-XML for object properties.

There are two core questions in this thread (and a number of distractions).

One is whether there is a significant need for a lightweight collection mechanism.  As lightweight collections are a pretty core OO concept, there is a reasonable presumption that they are going to be important in OO programming, but what is not well established is whether using temp-tables even for cases in which the attributes of a temp-table are not really needed presents a meaningful performance issue.  That is testable.  I don't believe that best practice has a risk of manifesting the TMTT problem, even if temp-tables are used for all collections, but I could be proven wrong.

The second question is whether there are any existing language features which could be used for collections that were lighter weight than temp-tables.  The two which I have noted are arrays, which can hold objects, but which present issues of having a fixed extent, and work-tables, which behave a lot like 3GL collections, but which do not presently support a datatype of Class.

Posted by Admin on 07-Oct-2009 22:45

When ABL Collections with a Temp-Table for the internal data management become widely used, I believe that the best technique to avoid the "TMTT" issue is the use of a single static temp-table (shared by all Collection instances).

DEFINE PRIVATE STATIC TEMP-TABLE ttCollection NO-UNDO
    FIELD OwningInstance AS Progress.Lang.Object
    FIELD Key AS CHARACTER
    FIELD Index AS INTEGER
    FIELD MemberInstance AS Progress.Lang.Object
    INDEX OwningInstance OwningInstance.

Using that approach, all Collection class instances share a single temp-table. I know it's breaking encapsulation at instance level and I'm sure I'll receive criticism for it, but it works and is accepted by a number of clients already.

Within the Collection instance, I make sure (there's only a handful of accesses anyhow) that I always use OwningInstance = THIS-OBJECT when creating a new ttCollection record or accessing them. In the destructor, I delete all records belonging to the disposed Collection instance.

Since the core Collection implementation is done in a central generic place, the risk of an inexperienced developer breaking this code is minimal.
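As a rough sketch of how such a class might hang together (not the actual implementation): the class name, method names, and the MemberSeq field below are hypothetical, the Key and Index fields are left out for brevity, and it assumes a release that supports static class members and Progress.Lang.Object temp-table fields.

```abl
/* Collection.cls: sketch of a collection backed by one static temp-table
   shared by all instances. All names here are illustrative assumptions. */
CLASS Collection:

    /* One temp-table for every Collection instance in the session. */
    DEFINE PRIVATE STATIC TEMP-TABLE ttMember NO-UNDO
        FIELD OwningInstance AS Progress.Lang.Object
        FIELD MemberSeq      AS INTEGER
        FIELD MemberInstance AS Progress.Lang.Object
        INDEX idxOwner IS PRIMARY OwningInstance MemberSeq.

    /* Per-instance counter used to keep members in insertion order. */
    DEFINE PRIVATE VARIABLE iLastSeq AS INTEGER NO-UNDO.

    /* Every record is stamped with THIS-OBJECT as its owner. */
    METHOD PUBLIC VOID AddMember (INPUT poMember AS Progress.Lang.Object):
        CREATE ttMember.
        ASSIGN
            iLastSeq                = iLastSeq + 1
            ttMember.OwningInstance = THIS-OBJECT
            ttMember.MemberSeq      = iLastSeq
            ttMember.MemberInstance = poMember.
    END METHOD.

    /* Every read is filtered on the owner as well. */
    METHOD PUBLIC INTEGER GetSize ():
        DEFINE VARIABLE iSize AS INTEGER NO-UNDO.

        FOR EACH ttMember WHERE ttMember.OwningInstance = THIS-OBJECT:
            iSize = iSize + 1.
        END.
        RETURN iSize.
    END METHOD.

    /* Remove this instance's records when the collection is disposed. */
    DESTRUCTOR PUBLIC Collection ():
        FOR EACH ttMember WHERE ttMember.OwningInstance = THIS-OBJECT:
            DELETE ttMember.
        END.
    END DESTRUCTOR.

END CLASS.
```

The point of the sketch is only that the owner filter appears in every access and that the destructor cleans up; a real class would add lookup, removal, and iteration.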

Posted by Admin on 07-Oct-2009 22:50

tamhas wrote:

The second question is whether there are any existing language features which could be used for collections that were lighter weight than temp-tables.  The two which I have noted are arrays, which can hold objects, but which present issues of having a fixed extent, and work-tables, which behave a lot like 3GL collections, but which do not presently support a datatype of Class.

I'd be surprised if there were any construct other than temp-tables or arrays available in the language TODAY.

Posted by Admin on 07-Oct-2009 22:54

guilmori wrote:

I still have to see concrete examples of your "avoiding TMTT" patterns, but for me, this means coding around a problem, ie: patching.

See my comment about the static temp-table, and YES, it's patching.

Posted by jmls on 08-Oct-2009 00:34

The thing with this method, is that it, well,  just works. And works well.

I have almost the exact same structure with the difference of having the owning instance as a GUID, not object as I have some .p / .w that need to use the structure

Posted by Admin on 08-Oct-2009 01:13

jmls wrote:

The thing with this method, is that it, well,  just works. And works well.

I have almost the exact same structure with the difference of having the owning instance as a GUID, not object as I have some .p / .w that need to use the structure

For the procedural use cases, I prefer using the Collection class from the procedure. In that case the owner of the records in the static temp-table is the Collection instance which is used by the procedure.

Posted by jmls on 08-Oct-2009 01:27

True.

Posted by Thomas Mercer-Hursh on 08-Oct-2009 11:36

Well, you knew in advance that I wasn't going to like the use of a common static object, but let me ask you this ... is there a TMTT problem using TTs for collections?  The reports I have heard about TMTT seem to be in the thousands and even there, good use of startup parameters makes the problem less.  Are you really likely to have thousands of collections active in a session at once?  The only place where I have anticipated a TMTT problem is either by simply using a TT for everything or some form of compounding, i.e., temp-tables which contain temp-tables.  Do you have experience of TMTT driving you to this solution or are you possibly avoiding a problem which doesn't exist?

In any case, there are four other objections I would make to this approach.

One, of course, is that the work-table solution would avoid the problem for unstructured collections and not require this unattractive work-around.

Second, a generic temp-table like this is only going to work for generic cases since you have a key and an index only.  I.e., there is no option to fit the temp-table to the solution.

Third, you have a key and index which are not needed for the unstructured collection, which was the focus of this thread.

Fourth, if one is interested in model to code, one should recognize that collections are not represented as such in UML models.  All one sees in the model is a one to many relation.  Under the covers, collections are used as generic infrastructure to provide a container for expressing this relation.  This means that the code transforms need to be able to deal with a limited number of alternatives and apply the correct implementation for any given instance.  Yes, one could probably provide decision points and alternate templates, some of which would use this structure, but it is not a complete solution.  In particular, it is overkill for an unstructured solution and provides no solution at all for objects which have multiple, simultaneous one to many relations, e.g., order to order line where they are linked by line number, product group, discount category, shipping address, etc.

Posted by Thomas Mercer-Hursh on 08-Oct-2009 11:38

Work-tables are there ... it is just that no one bothered to allow them to use a class datatype because it is a deprecated feature.  Well, maybe it should be undeprecated.

Posted by Thomas Mercer-Hursh on 08-Oct-2009 11:40

YES, it's patching.

So, when we write up the use case for improved language support, we cite this approach under the "Existing Workarounds" section to illustrate how much better things would be with better language support?

Posted by Admin on 08-Oct-2009 12:36

So, when we write up the use case for improved language support, we cite this approach under the "Existing Workarounds" section to illustrate how much better things would be with better language support?

Feel free to do so. I'd be happy if I could use instance level encapsulation with my Collection classes.

I'm glad I have only a single place to change this once the language supports it better

Posted by Admin on 08-Oct-2009 12:47

Are you open for a (long time) bet?

I bet that we will see real collections and generic classes in the language before the undeprecation of work-files.  And I guess that's a good thing.

Posted by Thomas Mercer-Hursh on 08-Oct-2009 12:50

No bet, since I am afraid that the time frame may be never for both.  But, you can't blame me for asking if this would be an easy fix that someone could slip in some afternoon.

Posted by Admin on 08-Oct-2009 13:09

tamhas wrote:

No bet, since I am afraid that the time frame may be never for both.

I'm not that pessimistic... OOABL moves slowly but steadily.

But, you can't blame me for asking if this would be an easy fix that someone could slip in some afternoon.

I appreciate any effort!

Posted by Thomas Mercer-Hursh on 08-Oct-2009 13:42

OOABL moves slowly but steadily.

Indeed, while it has taken a few years, the progress has been quite impressive.  That doesn't mean one doesn't still keep wishing for more, of course.

The reason for being pessimistic has to do with whether or not there is a perceived need.  I.e., what is the response to someone saying "We don't need collections because we have temp-tables"?

To have a response to this, one needs to be able to demonstrate one of the following:

* Use of TTs leads almost inevitably to a TMTT problem;

* Workarounds are not merely unaesthetic, but don't really do the job;

* There is a meaningful performance overhead for creating a non-indexed or merely sequential TT which would not exist with other approaches.

I.e., there has to be a solid reason why TT aren't already good enough.

I propose the WT alternative, not because I think it is an ideal solution, but because it seems trivial and thus someone might decide to slip it into a release with the explanation that it provides a low impact mechanism for simple collections.  I.e., good bang for the apparent effort.

Posted by Thomas Mercer-Hursh on 04-Jan-2010 13:59

Another context in which to consider pulling work-tables off of the endangered species list is message data objects.  I am thinking here of short-lived messages passed between subsystems to minimize coupling.  If one needs to send multiple rows of data, then a work-table is actually ideal since it only needs to be sequentially created and sequentially read and then it is thrown away.  Using a work-table would allow this to be much lower weight than using a temp-table.

This would work with work-tables as they are today ... but I still want to be able to define fields of type PLO.  I just think that this is a usage which is perhaps particularly characteristic of OO, but could be used in any kind of programming, which shows that there is a reason to preserve work-tables and even enhance them.

Posted by guilmori on 04-Mar-2010 13:06

tamhas wrote:

But, briefly, I don't know that we need PSC to give us collections per
se.  Collection classes in Java are just Java code, not some special
built-in.  I think there are only two problems with us writing and
using our own collection classes.  One is the lack of ABL generics
which means that we have to create a lot of type-specific code.  The
other is that TTs are kind of heavyweight entities for anything other
than cases where one needs a map class.  For simpler collections, I
think a work-table might just do the trick, but we need support for PLO
fields.

First, ABL is not Java.  Ha ha, sorry, I've been told "ABL is not this" so many times lately, I couldn't resist.

I do not mean built-in as a keyword, but meaning that it ships with the product and is readily available to EVERY developer.

I see some more problems in not having "built-in" collections:

- First and obvious, one has to re-invent the wheel by building one's own library of collection classes. This takes time, time not spent on project development, time that is not available to everyone. Thanks, you already did this work and made it available to the community. But can we really expect that every ABLer in the world will use your collections? No. I think PSC should provide the base, and developers can then specialize it if needed.

- Now, if collections were implemented deep in the internals of ABL, maybe they could make use of a much more efficient internal data structure than what is available to us.

- I think the ABL could make use of a collection interface in its own statements, or in PSC's upcoming class libraries.

Posted by Thomas Mercer-Hursh on 04-Mar-2010 14:23

I suppose I am less concerned about it shipping with the product exactly because I have a certain historically-based skepticism about framework components from PSC.  I think the ABL community is capable of developing a repository of solutions.  The key is language features that make it possible and efficient.

I am particularly concerned about something too built in because that implies that it is not modifiable and I don't want to have to wait another several releases to get them to enhance it.

Posted by Admin on 04-Mar-2010 16:57

I suppose I am less concerned about it shipping with the product

Many users are more concerned about this than others.

Posted by Thomas Mercer-Hursh on 05-Mar-2010 11:15

I think the key point is that we need enabling technology more than we need PSC to provide framework classes.  If they implement collections per se as a hard-coded language feature, then we are stuck with what they happen to do.  If they create enabling technology, e.g., PLO in a WT, then it is fine whether or not they also supply a framework implementation since we can all subclass or replace or supplement what they provide.
