Usage of BIND, REFERENCE-ONLY and BY-REFERENCE with ProDataS

Posted by Peter Judge on 23-May-2019 15:53

Hi,

As part of some research into improving the ABL, we're looking to understand your usage of (primarily) the BIND functionality in the ABL as it pertains to ProDataSets and temp-tables, as well as generalised cases of binding and passing those data structures around in your application code.



- What are you trying to achieve?
- How do you use this functionality?
- What challenges do you face using the functionality? I know, for instance, that you can't define interfaces with public TT/PDS members so that causes hoop-jumping.
- Anything else pertinent to BIND/REFERENCE passing of data structures?

Thanks for any input you have,

-- peter

All Replies

Posted by Stefan Drissen on 23-May-2019 16:16

I find the biggest challenge the syntax - somehow it escapes me every time and the documentation does not help me.

Posted by Jeff Ledbetter on 23-May-2019 16:48

"What are you trying to achieve?"

Performance. We want to avoid passing around possibly large datasets.

"How do you use this functionality?"

From a procedural/super-procedure POV:

For BIND, a specific use-case is when we have a large dataset that could be used in different ways depending upon the usage context, so we'll start the appropriate SUPER and BIND to the instance in the caller. Or, if there is some common logic that may act upon a dataset regardless of context, that routine is started as a super and then BIND is used to bind to the instance in the caller. This occurs in some of our MVP patterns.

Another BIND use case is when we want to "borrow" the dataset in a super so we'll OUTPUT dataset BIND to a reference-only dataset in the caller.

We use BY-REFERENCE when the client has sent a dataset to the server. From our service layer, that dataset is then passed along BY-REFERENCE deeper into the application to avoid deep-copy of a potentially large dataset.

"What challenges do you face using the functionality?"

Syntax was somewhat clunky at first and it's not always immediately obvious.

"Anything else pertinent to BIND/REFERENCE passing of data structures?"

Please don't change the syntax. We have it working. :)

Posted by Laura Stern on 23-May-2019 17:58

I agree with Stefan whose "biggest challenge is the syntax".  The functionality is clear.  But the usage of the syntax is not.  Where do you use the word BIND and where BY-REFERENCE?  i.e., On which side of an INPUT or an OUTPUT parameter do you put each keyword.   What are the combinations that are allowed on the different ends of a procedure/method call?  Do you need BIND on one side and BY-REFERENCE on the other, or is it just one of those, and which one for INPUT vs. OUTPUT.  And also on one side the TT or DataSet should be defined as REFERENCE-ONLY. This is not clearly documented anywhere that I know of.  I have tried to do this myself and usually fail to get it to work!  Jeff or Peter, if you know how to use this, please contact the content (e.g., documentation) team and give them a list of the rules so they can document it :-)

Posted by dbeavon on 23-May-2019 19:02

If I use the word "static" below it is in relation to statically defined data; I'm not talking about static membership of a class.

BIND has more explicit declaration requirements.  My understanding is that it has to be used from both the caller and callee.  Whereas passing something "BY-REFERENCE" can be done as an afterthought and different clients of a given program can either use that convention or not.  I suspect that whenever you send BY-REFERENCE data to a program, then the runtime for that program will simply dispose of the instance of the static data that it would have otherwise used.

BY-REFERENCE seems to be a more flexible and more easily approached mechanism for static data that is being passed for INPUT arguments and INPUT-OUTPUT .  The only time that a client program absolutely MUST designate that an argument is BY-REFERENCE is if the called program has declared the data to be "REFERENCE-ONLY".  But you probably won't know that is the case until run-time.

On that note, most of the complaint I have with this stuff is that the compiler doesn't help matters much, and leaves you in suspense until you test out your code at run-time.  But this complaint is not isolated to this topic.  Thankfully the OO side of ABL provides much better compile-time assistance.

There are some very *unusual* gotchas that arise when using BY-REFERENCE data and sending it into an OO class where the member data is declared REFERENCE-ONLY (as we normally do in our business logic layer).  The member data in the OO class will now become a reference to the original instance of data that was passed in.  If you then call some other instance methods of the class, the reference will remain in place.  But when the flow-of-control finally *exits* from the original method (the one that received the BY-REFERENCE data) then the OO class will somehow "un-reference" that static data.  Any subsequent call to the class will fail if it tries to access the same REFERENCE-ONLY member data (without supplying it once again as a parameter).  This behavior seemed fairly strange when I first encountered it, especially since ABL does *not* seem to provide any "un-reference" or "un-bind" operation for programmers to use ourselves.  It is also unnerving to have a class instance with a reference to a dataset one moment, and then on a subsequent call to the same OO class, you find that the dataset has vanished, simply because of the flow-of-control (unwinding of the callstack).  It's almost like the person who came up with the mechanism wasn't able to decide if the data was going to be a true member of the class or a true parameter on the stack.  They decided to walk a confusing path between the two different concepts.  Maybe we just need yet another keyword for BY-REFERENCE-STICKY. ;-)

As far as BIND goes, I avoid that wherever possible, since the pattern is ugly, and less familiar, and in most cases can be replaced with BY-REFERENCE ( / REFERENCE-ONLY).

The place we use BIND is to *permanently* "borrow"/"share" a dataset from a class that declared it *without* "REFERENCE-ONLY" to one that declares it *with* "REFERENCE-ONLY".  We will declare data *without* the "REFERENCE-ONLY" keyword in a remote class whose only purpose in life is to be used as a "dataset factory".  It creates and distributes new instances of static data.  This means that any business logic class can then call the factory and ask for the static data via "OUTPUT DATASET DS_GiveMeYourFreshData  BIND".

The nice thing about BIND is that it creates a "sticky" reference, and the business logic class won't lose hold of that reference once it has been acquired.
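
A minimal sketch of the "dataset factory" pattern described above may help. The class names (OrderDataFactory, OrderLogic), the dataset and its fields are hypothetical and not taken from the poster's application; the only things assumed from the thread are that the owning class defines the dataset without REFERENCE-ONLY, the borrowing class defines it with REFERENCE-ONLY, and BIND appears on both sides of the OUTPUT parameter.

```abl
/* --- OrderDataFactory.cls (hypothetical) --- */
class OrderDataFactory:
    /* Defined WITHOUT REFERENCE-ONLY: this class owns the real instance. */
    define temp-table ttOrder no-undo
        field OrderNum as integer
        index idxOrderNum is primary unique OrderNum.
    define dataset dsOrder for ttOrder.

    constructor public OrderDataFactory ():
        /* Illustrative data only. */
        create ttOrder.
        ttOrder.OrderNum = 1.
    end constructor.

    /* Callee side of OUTPUT ... BIND: no copy is made on return;
       the caller's REFERENCE-ONLY dataset is pointed at this one. */
    method public void GetData (output dataset dsOrder bind):
    end method.
end class.

/* --- OrderLogic.cls (hypothetical) --- */
class OrderLogic:
    /* Defined WITH REFERENCE-ONLY: no storage until something is bound. */
    define temp-table ttOrder no-undo reference-only
        field OrderNum as integer
        index idxOrderNum is primary unique OrderNum.
    define dataset dsOrder reference-only for ttOrder.

    define private variable oFactory as OrderDataFactory no-undo.

    constructor public OrderLogic ():
        oFactory = new OrderDataFactory().
        /* Caller side: BIND is repeated here as well. */
        oFactory:GetData(output dataset dsOrder bind).
    end constructor.

    /* Later calls can use ttOrder statically without passing it again. */
    method public integer CountOrders ():
        define variable iCount as integer no-undo.
        for each ttOrder:
            iCount = iCount + 1.
        end.
        return iCount.
    end method.
end class.
```

The sketch keeps the factory in a member variable purely as a cautious design choice, so that the owner of the data stays alive for as long as the borrowing class does.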

To summarize (and I hope this doesn't sound too negative), I think the biggest complaint is all the false confidence that you get from the compiler, so that you have to wait until run-time to see what you did wrong.  Even then, some of the runtime error messages related to static data parameters are very, very bad - they are some of the worst runtime messages you will ever see from the AVM.  Next, I think the implicit "unbinding" of BY-REFERENCE parameters from an OO class instance is very unfortunate.  It seems to happen arbitrarily when the flow-of-control finally exits the original method that provided the input data.  Finally, I think it is unfortunate that the default behavior for passing static data is NOT by reference.  Maybe we could start over with a whole new mode where all the default parameter passing of static data is by reference.  Then you could have a new set of keywords (DATA-COPY, or BY-VALUE) when you really wanted your static data to be copied by value.

I am somewhat biased in my opinions since I do just as much programming in .Net as I do in ABL.  Maybe someone who works in ABL all the time (and avoided the OO-ABL stuff) would not have as many complaints as I do.

Posted by Jon Brock on 23-May-2019 19:16

We are using BIND for performance and to facilitate sharing and manipulation of temp-table default buffers. We create classes that implement the DAO pattern. However, rather than a class that encapsulates each business entity, the data is stored in a dataset or temp-table and the DAO classes wrap them. BIND functionality allows us to:

1. Access buffers statically, which allows for compile time checking and makes the code easier to read and maintain.

2. Treat the DAO classes as lightweight disposable objects since they don't entail temp-table initialization

3. Share the default buffer between the business procedure and the DAO class, which again makes the code easier to read and maintain

To illustrate with code:

oCustomerDao = new SomeDao(table ttCustomer bind).
oCustomerDao:LoadById("blah").
ttCustomer.x = y.
oCustomerDao:UpdateCustomer().

oCustomerDao:LoadAllModifiedSince(today - 7).
for each ttCustomer:
   ttCustomer.x = y.
   oCustomerDao:UpdateCustomer().
end.

One thing that would be a great enhancement (that actually has nothing to do directly with BIND, but just how we use it) would be to have static buffer access for before-tables. For instance it would be great to do:

WritePreTransValidate():

   define buffer bOldCustomer for ttCustomerBefore.

   bOldCustomer:find-by-rowid(buffer ttCustomer:before-rowid).

   if bOldCustomer.Email <> ttCustomer.Email then do: etc

It would be even better if you could do:

   define before-buffer bOldCustomer for ttCustomer.

And then bOldCustomer would always point to the appropriate before record for the current ttCustomer buffer.

Posted by Tim Kuehn on 23-May-2019 20:36

I avoided BIND until a customer mandated its use.

The main issue I had with it was the difference in how it behaved vs other TT models - particularly when it came to passing a BUFFER or doing a DEFINE BUFFER in a procedure / at the procedure level. These constructs would work in non-bind scenarios, and I'd like to see them work in a BIND scenario. (I'd note that the version I'm using may not have the latest updates which might've fixed some of these issues.)

What I'd really like to see from BIND is the ability to share a TT instance across multiple objects / procedures, and give each instance their own default buffer. Calling one routine and having its actions change the buffer pointer in another routine is a problem I'd like to see gone. :)

I'd also like to see consistent syntactical treatment between BIND, BY-REFERENCE, and straight TT.

Posted by Laura Stern on 23-May-2019 20:42

Re dbeavon's comment:

There are some very *unusual* gotchas that arise when using BY-REFERENCE data and sending it into an OO class where the member data is declared REFERENCE-ONLY (as we normally do in our business logic layer).  The member data in the OO class will now become a reference to the original instance of data that was passed in.  If you then call some other instance methods of the class, the reference will remain in place.  But when the flow-of-control finally *exits* from the original method (the one that received the BY-REFERENCE data) then the OO class will somehow "un-reference" that static data.  Any subsequent call to the class will fail if it tries to access the same REFERENCE-ONLY member data (without supplying it once again as a parameter).

There's nothing strange about that.  The data is not copied.  That's the whole point.  Once we return from the method, we can also return from the caller, where the data actually lives.  So of course, we can't reference the data any more if we call the method again  (without supplying it once again as a parameter).  The data could be gone.  Of course, maybe the caller hasn't returned.  Maybe the caller is still there and calls the method again.  But the AVM doesn't know that, and I don't see why it should.  Just pass in the same TT again.  If you want the data to be there all the time once the call is made, then the data needs to be copied.  That makes it "sticky"!

Posted by jquerijero on 23-May-2019 20:52

[quote user="Jeff Ledbetter"]

"What are you trying to achieve?"

Performance. We want to avoid passing around possibly large datasets.

"How do you use this functionality?"

From a procedural/super-procedure POV:

For BIND, a specific use-case is when we have a large dataset that could be used in different ways depending upon the usage context, so we'll start the appropriate SUPER and BIND to the instance in the caller. Or, if there is some common logic that may act upon a dataset regardless of context, that routine is started as a super and then BIND is used to bind to the instance in the caller. This occurs in some of our MVP patterns.

Another BIND use case is when we want to "borrow" the dataset in a super so we'll OUTPUT dataset BIND to a reference-only dataset in the caller.

We use BY-REFERENCE when the client has sent a dataset to the server. From our service layer, that dataset is then passed along BY-REFERENCE deeper into the application to avoid deep-copy of a potentially large dataset.

"What challenges do you face using the functionality?"

Syntax was somewhat clunky at first and it's not always immediately obvious.

"Anything else pertinent to BIND/REFERENCE passing of data structures?"

Please don't change the syntax. We have it working. :)

[/quote]

This is pretty much how we use the features. The main headache I normally encounter is making sure that the proper keyword (BIND, BY-REFERENCE) is included in the parameter definition. If you miss it, it is hard to chase.

Posted by dbeavon on 23-May-2019 22:40

>> Just pass in the same TT again.  If you want the data to be there all the time once the call is made, then the data needs to be copied.  That makes it "sticky"!

It was the first time I had ever encountered a scenario where a static TT stopped referencing something which had previously been referenced.  I suppose the upshot is that it makes our business classes slightly more reusable - if you can rebind their static references again to some totally different TT.  But that doesn't come up very often, and I would typically just create a new instance of a logic-class if/when I need to provide it with a different instance of a TT.

It's odd that the runtime is able to unbind the static TT when the method returns.  As far as I know there is no way for a programmer to do that ourselves.  If the runtime can do it, then you would think the ABL language would have a statement to do the same thing.

Another "sticky" approach would be to use BIND.  For example if I have two logic-class instances A and B.  Both have reference-only data.  And instance A creates the logic-class B and wants to pass it a shared static dataset then it can transfer the reference permanently via a BIND parameter (rather than simply trying to pass it in with the "BY-REFERENCE" keyword).  The references all point to the same underlying data, and the references will remain in place for as long as the logic-class exists.

The BIND approach is more permanent.  But conceptually I would have expected them to behave in a similar way.  It does not feel right for the runtime to take away a dataset that had been referenced by a class's member.  It's like a rug getting pulled out from under your feet.  By the time it is taken away, the dataset reference has become an integral member of the class, and may have been used from the context of *numerous* private methods (without necessarily passing it around as a parameter).

Things would be less confusing if there were a way to scope the static dataset declaration to a *single* method, rather than to the class as a whole.  Then it would be very clear why the dataset reference is lost after the method returns.  I think the current behavior tries to pretend that the data wasn't actually adopted as a member of the class for a period of time.

Posted by David Abdala on 24-May-2019 10:41

I agree that the documentation of BIND and BY-REFERENCE is very confusing, and it takes some time, and pain, to get it to do what you actually want.

We use it extensively for NSRA framework, where DAO classes are subdivided in Main, and Subordinate classes.

Main DAO classes define a PRODATASET and a set of "regular" TTs to hold a full DataEntity, where a DataEntity is a "whole" business concept, like an "Invoice DataEntity", which is all the information relevant to an invoice in the business language: Head, Detail Lines, Customer, Shipment, Receipts, etc.

Each DataEntity is held by only one DAO class, which is responsible only for the "main" TT of the data structure. For each other TT of the DataEntity a Subordinate DAO is instantiated, and BINDed to the TT which this instance is responsible for.

Each TT is defined in an include file, with an "optional" REFERENCE-ONLY, so TTs are included in the Main DAO class as regular TTs, and in the Subordinate DAO class as REFERENCE-ONLY.

Subordinate DAO classes BIND to the TT in the constructor, so the data is available throughout the lifecycle of the object. The lifecycle of Subordinate DAO objects is under Main DAO object control, so there is no way of having an "unbound" REFERENCE-ONLY TT in a Subordinate DAO object.

As only one object operates on every TT, there is no DEFAULT-BUFFER conflict, besides "regular" conflicts, unrelated to BIND and BY-REFERENCE.
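
As a rough illustration of the include-file arrangement described above, the sketch below assumes the "optional" keyword on the definition is REFERENCE-ONLY, supplied through a named include argument. The file name ttInvoiceLine.i, the argument name, the fields and the class InvoiceLineDao are all hypothetical, not taken from the NSRA framework.

```abl
/* --- ttInvoiceLine.i (hypothetical include file) ---
   {&reference-only} expands to nothing unless the including
   file passes the named argument. */
define temp-table ttInvoiceLine no-undo {&reference-only}
    field InvoiceNum as integer
    field LineNum    as integer
    index idxLine is primary unique InvoiceNum LineNum.

/* --- In the Main DAO class body: a regular temp-table --- */
{ttInvoiceLine.i}

/* --- InvoiceLineDao.cls (hypothetical Subordinate DAO) --- */
class InvoiceLineDao:
    /* Same definition, but with no storage of its own. */
    {ttInvoiceLine.i &reference-only=reference-only}

    /* BIND in the constructor, so the association is meant to last
       for the lifetime of this object. */
    constructor public InvoiceLineDao (input table ttInvoiceLine bind):
    end constructor.
end class.

/* --- In the Main DAO: create the subordinate and bind it --- */
/* oLineDao = new InvoiceLineDao(input table ttInvoiceLine bind). */
```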

We don't use it anywhere else. As others stated, it is a "dangerous" functionality, as it only explodes at runtime, and in ways that are very difficult to debug.

Most of the use of BIND and BY-REFERENCE is due to the fact that "you" (read Progress) never implemented true TT objects, from which you can INHERIT. I've been waiting for this since 2009... luckily I'm sitting most of the time...

Posted by Peter Judge on 28-May-2019 19:07

Thanks to everyone who replied: at the very least there's _huge_ scope for improvement in the docs from all the comments here. Maybe also in the syntax. I'll pass this thread on to the appropriate teams for their edification and delight.

I'm also getting that BIND helps manage/pass a single copy of a temp-table's data.

Posted by Tim Kuehn on 28-May-2019 21:01

[quote user="Peter Judge"]

I'm also getting that BIND helps manage/pass a single copy of a temp-table's data.

[/quote]

Strictly speaking - BIND and BY-REFERENCE don't pass anything but a TT reference.

BIND permanently associates the target instance's TT references with the source's TT reference.

BY-REFERENCE associates the target instance's TT references with the source's TT reference for the duration of the call.

On the other hand, a straight INPUT TABLE tt_Table parameter does a call-by-value which creates a duplicate copy of TT data. 
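
The three cases above can be sketched side by side. This is an illustration under stated assumptions rather than a definitive rule list: the class name CustomerLogic, its methods and the fields are hypothetical, and the keyword placement follows what posters in this thread describe (BY-REFERENCE on the caller side only, BIND on both sides, REFERENCE-ONLY on the side that does not own the data).

```abl
/* --- CustomerLogic.cls (hypothetical) --- */
class CustomerLogic:
    /* REFERENCE-ONLY: this class never owns the data itself. */
    define temp-table ttCustomer no-undo reference-only
        field CustNum  as integer
        field CustName as character
        index idxCustNum is primary unique CustNum.

    /* Used with BY-REFERENCE: nothing special in the signature. */
    method public integer CountRows (input table ttCustomer):
        define variable iRows as integer no-undo.
        for each ttCustomer:
            iRows = iRows + 1.
        end.
        return iRows.
    end method.

    /* Used with BIND: the keyword is part of the signature too. */
    method public void Attach (input table ttCustomer bind):
    end method.

    /* Relies on an earlier Attach() call having bound the table. */
    method public integer CountAttached ():
        define variable iRows as integer no-undo.
        for each ttCustomer:
            iRows = iRows + 1.
        end.
        return iRows.
    end method.
end class.

/* --- caller.p (hypothetical) --- */
define temp-table ttCustomer no-undo
    field CustNum  as integer
    field CustName as character
    index idxCustNum is primary unique CustNum.

define variable oLogic as CustomerLogic no-undo.
define variable iRows  as integer       no-undo.

create ttCustomer.
assign ttCustomer.CustNum  = 1
       ttCustomer.CustName = "Example".

oLogic = new CustomerLogic().

/* BY-REFERENCE: caller side only; association lasts for this call. */
iRows = oLogic:CountRows(input table ttCustomer by-reference).

/* BIND: both sides; association outlives the call. */
oLogic:Attach(input table ttCustomer bind).
iRows = oLogic:CountAttached().

/* No keyword would request a by-value copy. Because the callee's table
   is REFERENCE-ONLY here, that form is expected to fail at run time
   (as noted earlier in the thread), so it is left commented out. */
/* iRows = oLogic:CountRows(input table ttCustomer). */
```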

Posted by jquerijero on 28-May-2019 22:00

Now that that is taken care of, let's talk about the implicit temp-table and dataset copies that are created and don't go away when passing them to a procedure/method that expects a table/dataset HANDLE. :)

Posted by Torben on 29-May-2019 09:29

In general I'd like to see dynamic temp-tables and datasets behave more like JsonObject.

All passing is by reference.

When crossing the AppServer boundary, it is serialized/deserialized automatically on the call.

If you really need a copy, you need to clone.

When not referenced any more it is gc'ed

(And preferably implemented as ttobject and dsobject)

Posted by Blake Stanford on 29-May-2019 14:10

Jeff Ledbetter wrote: We use BY-REFERENCE when the client has sent a dataset to the server. From our service layer, that dataset is then passed along BY-REFERENCE deeper into the application to avoid deep-copy of a potentially large dataset.

I'm not sure if you are referring to an AppServer when you said "sent a dataset to the server". If so, then my understanding is that anytime the client/AppServer boundary is crossed, a DEEP COPY is always performed.  The AVM creates an implicit table on the AppServer and copies the dataset/temp-table.  Caution: in the documentation there are several gotchas that cause the implicit dataset/temp-table to not be deleted, leaving a memory leak.  Here is a reference to a KB about it.

P112397, "When does Progress create dynamic TEMP-TABLEs automatically"

Posted by Jeff Ledbetter on 29-May-2019 14:24

Blake, yes, we use TABLE-HANDLE and DATASET-HANDLE for our input/output params to our proxy layer. We are quite diligent about cleaning all of those up. :-)  After the data comes across, we try to use BY-REFERENCE to pass it around the business logic (or pass the handle itself when dynamic access is involved). Fun stuff.

Posted by marian.edu on 04-Jun-2019 05:22

[quote user="Tim Kuehn"]

Strictly speaking - BIND and BY-REFERENCE don't pass anything but a TT reference.

[/quote]

That probably just makes it more confusing. It's really not the same thing, and albeit I've used BIND before (damn, it's so tempting, much like static), I now see it more as one of those features that creates anti-patterns :(

[quote user="Tim Kuehn"]

BIND permanently associates the target instance's TT references with the source's TT reference.

[/quote]

'Permanently' there is more relative: if the reference you bind to is dynamic, it will just go away when deleted. Oddly enough, a static object will remain valid as long as anything that has a reference to it is still alive, even if the class/procedure that instantiated it was deleted... much like a shared variable :(

Probably the only use-case for BIND is when you need to wrap a table/dataset in an object - either a 'request' object received over an AppServer call or one coming from the data access layer. One way to get around this is to inject the data structure in the constructor using by-reference and store only its handle, then use that 'reference' by passing it as a table/dataset handle to internal methods using by-reference. That way you always know if your reference is still valid and you can use static data access in internal methods.

Why table/dataset-handle parameters need by-reference I have no idea; for me this should be the default for 'handles'. Input-output could, arguably, also default to by-reference, although the only difference is that if the routine throws an error the original data structure remains as it was before the call.

class MyRequest:
    /* ttCustomer is defined REFERENCE-ONLY here, so it holds no data of its own */
    {ttCustomer.i &reference-only=reference-only}

    define private variable bindTableHandle as handle no-undo.

    /* The caller is expected to pass the table by-reference; only its handle is kept */
    constructor public MyRequest (input table ttCustomer):
        bindTableHandle = temp-table ttCustomer:handle.
    end constructor.

    method public character getName (custNum as integer):
        /* Re-supply the table on every call, but only while the handle is still valid */
        if valid-handle(bindTableHandle) then
            return getName(custNum, table-handle bindTableHandle by-reference).
        return ?.
    end method.

    method protected character getName (custNum as integer, input table ttCustomer):
        for each ttCustomer where ttCustomer.custNum = custNum:
            return ttCustomer.name.
        end.
        /* No matching record */
        return ?.
    end method.

end class.

Posted by Tim Kuehn on 12-Jun-2019 14:36

Marian -

By "permanently" I mean that the TT association between the caller and the callee extends past the duration of the call. That a static TT BIND can persist past the life of the source instance is - puzzling at the least, and dangerous at worst. 

The main use-case I have for BIND is the customer uses it in a procedural model. As such I don't have to worry about associations lasting past the duration of a procedure call. Beyond that I think it's intended for cases where multiple persistent procedures / classes are supposed to work together. I can see this for some situations - but I think the potential dangers of persistently binding like this outweigh any perceived benefits. 

Why table/dataset-handle parameters needs by-reference I have no idea,

Because they behave the same as their static counterparts and can be passed to / from a static TT / dataset parameter.
