Garbage collection for the sake of static temp-tables (in lo

Posted by dbeavon on 28-Jan-2020 00:42

I was wondering how to read this KB.

https://knowledgebase.progress.com/articles/Article/000037476

The only specific place where ABL garbage collection is forced to happen is on return from an AppServer call (deactivate procedure)

Is it saying that there is no ABL garbage collection except in appserver?  Or is it saying that appserver *forces* GC, but other types of clients (like _progres) will perform GC on occasion but it cannot be forced?

I have a class that takes a reference to a static temp-table and BINDs to it:

 
USING Progress.Lang.*.

BLOCK-LEVEL ON ERROR UNDO, THROW.

Class dkb.ReproBug.SeededDocumentLogic: 
   
   
   {dkb/ReproBug/Data/LocalSalesTransData.i REFERENCE-ONLY PRIVATE}
   
   CONSTRUCTOR PUBLIC SeededDocumentLogic (  
      INPUT-OUTPUT DATASET FOR DS_LocalSalesTrans BIND):
         
      SUPER ().
      
   END CONSTRUCTOR.
 
   
END Class.

The include dkb/ReproBug/Data/LocalSalesTransData.i is like so:

/* ************************************************************************ */
/* Initial data                                                             */
/* ************************************************************************ */
DEFINE {2} TEMP-TABLE LL{3}_sls_trans NO-UNDO {1}

   FIELD Field1 AS CHARACTER 
   FIELD Field2 AS CHARACTER 
   FIELD Field3 AS CHARACTER 
   
   INDEX LL_sls_trans_01 IS PRIMARY UNIQUE Field1 Field2 Field3
   
   INDEX LL_sls_trans_02 Field2
   
   INDEX LL_sls_trans_03 Field3.
   
   
   
     
      

/* ************************************************************************ */
/* The dataset with the local/session data that we care about.              */
/* ************************************************************************ */
DEFINE {2} DATASET DS{3}_LocalSalesTrans  {1}
 
   FOR 
   
      LL{3}_sls_trans.
      

I use this class in logic loops like so, and don't always run DELETE OBJECT (eg. in the case of unexpected errors).

/* ************************************************************************ */
/* LL_sls_trans - used for composing the data                               */
/* ************************************************************************ */
{dkb/ReproBug/Data/LocalSalesTransData.i}


/* ************************************************************************ */
/* Operation vars                                                           */
/* ************************************************************************ */
DEFINE VARIABLE v_DocLogic AS dkb.ReproBug.SeededDocumentLogic NO-UNDO.

  
/* ************************************************************************ */
/* Do logic in a loop.                                                      */
/* ************************************************************************ */
DEFINE VARIABLE v_Loop AS INTEGER NO-UNDO.
DO v_Loop = 1 TO 5:
   

   /* ************************************************************************ */
   /* Output for this invoice                                                  */
   /* ************************************************************************ */
   EMPTY TEMP-TABLE LL_sls_trans.
         
   /* ********************************************************************* */
   /* SEEMS TO CREATE LEAKS */
   /* ********************************************************************* */
   v_DocLogic = NEW dkb.ReproBug.SeededDocumentLogic(INPUT-OUTPUT DATASET DS_LocalSalesTrans BIND).
   
   /* DO WORK HERE */
  
   /* Clean up Doc Logic   
   DELETE OBJECT v_DocLogic.*/
   
 
   
END.

Unfortunately I find that my static temp-tables "leak" as a result of the reference that is made by the OOABL class.  The error I eventually get is:

SYSTEM ERROR: Attempt to define too many indexes for area 6 database DBI11996a17396. (40) (14675)

I'm assuming the OOABL class instances are leaking too, but the problem doesn't expose itself as severely as the temp-tables that are leaked at the same time. I'm wondering what I need to do to avoid the leaking of static temp-tables.  Is there a way to ensure that GC will run at some point to release the "leaked" temp-tables?  Can I put a pause statement in the code or something, to encourage GC?  Any tips would be appreciated.

All Replies

Posted by Mike Fechner on 28-Jan-2020 02:34

ABL GC is performed in every other session type as well; it is not specific to the AppServer. As the K-Base article states, the ABL GC is (unlike in .NET) almost always executed synchronously. The AppServer deactivate seems to be slightly special, as GC seems to be enforced there. Since, in my experience, the ABL GC in its current implementation always works synchronously, I doubt that the special GC activation during the AppServer deactivate event ever has anything to do.

The class you have posted here defines a REFERENCE-ONLY dataset/temp-table. That means the temp-table schema is only provided in the source code (include file) so that the compiler can perform strong typed checks against the fields and tables. When your class is executed it won't create an instance of the dataset and temp-table. It will bind to a dataset/temp-table instance you are providing as an argument to the constructor. As the class instance is not the "owner" of the dataset/temp-table the GC will not clean that up when your object instance is GC'ed.

When you experience leaking of that temp-table (and finally too many TT indexes in the temp-table database file), this must be coming from the procedure that contains your temp-table definition without the REFERENCE-ONLY keyword and thus creates the instance of the temp-table.

Persistent procedures, on the other hand, are (like dynamic buffer object handles, query object handles, etc.) considered "handle based objects" in the language. The GC only works for "class based objects".

Without knowing more about your code, I would assume that the leak comes from a persistent procedure that defines the temp-table and is not properly cleaned up.

You could also loop through the SESSION:FIRST-OBJECT chain (and then the object's NEXT-SIBLING) to gain insights into loaded object instances. I sometimes use code like this:

DEFINE VARIABLE o AS Progress.Lang.Object NO-UNDO.

o = SESSION:FIRST-OBJECT.

DO WHILE VALID-OBJECT (o):

    LOG-MANAGER:WRITE-MESSAGE (SUBSTITUTE ("&1 &2", o:GetClass():TypeName, o:ToString())).

    o = o:NEXT-SIBLING.

END.

A similar loop can be coded based on the SESSION:FIRST-PROCEDURE handle.
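
For what it's worth, a minimal sketch of that procedure-based variant might look like the following. It assumes a client log is already open (otherwise LOG-MANAGER:WRITE-MESSAGE has nowhere to write), and it only walks the chain that SESSION:FIRST-PROCEDURE exposes. The variable name hProc is just illustrative.

```abl
/* Walk the session's procedure chain and log each entry. */
DEFINE VARIABLE hProc AS HANDLE NO-UNDO.

hProc = SESSION:FIRST-PROCEDURE.

DO WHILE VALID-HANDLE (hProc):

    /* FILE-NAME is the name the procedure was run with. */
    LOG-MANAGER:WRITE-MESSAGE (SUBSTITUTE ("&1 (persistent: &2)", hProc:FILE-NAME, hProc:PERSISTENT)).

    hProc = hProc:NEXT-SIBLING.

END.
```

Entries that keep showing up here after the work is done are candidates for a missing DELETE PROCEDURE.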

Posted by Torben on 28-Jan-2020 10:04

In general, the rule for all ABL widgets is: you create it, you delete it.

GC is only for OOABL objects. (You are also responsible for deleting ABL widgets that are used inside OO code.)

Widgets include dynamic temp-tables, dynamic datasets ....

From the error, it seems like you either have persistent procedures with temp-tables that are not deleted,

or you have dynamic temp-tables / datasets that are not deleted. The following code checks for dynamic buffers, including buffers for dynamic temp-tables.

DEFINE VARIABLE hBuffer AS HANDLE  NO-UNDO.   

FUNCTION InstName RETURNS CHARACTER PRIVATE (hWidget AS HANDLE):
   IF VALID-HANDLE(hWidget:INSTANTIATING-PROCEDURE) THEN DO:
      IF hWidget:INSTANTIATING-PROCEDURE = SESSION THEN RETURN "SessionLevel":U.
      ELSE RETURN hWidget:INSTANTIATING-PROCEDURE:NAME.
   END.
   RETURN "Object or deleted procedure":U.
END FUNCTION.

hBuffer = SESSION:FIRST-BUFFER.
DO WHILE VALID-HANDLE(hBuffer):
   MESSAGE SUBSTITUTE("Memory usage (buffer) = &1 created in &2 found record size in bytes &3":U, (IF hBuffer:DBNAME <> "PROGRESST":U THEN hBuffer:DBNAME ELSE "TEMP-TABLE":U) + ".":U + hBuffer:TABLE, InstName(hBuffer), hBuffer:RECORD-LENGTH).
   hBuffer = hBuffer:NEXT-SIBLING.
END.

Posted by dbeavon on 28-Jan-2020 16:16

Thanks for the feedback guys.

The code I posted was fairly complete.  Assuming the last snippet above is put in a program called dkb/ReproBug/SeedLogicWorker.p, you just need to run it repeatedly like so.

DO WHILE TRUE:

           RUN dkb/ReproBug/SeedLogicWorker.p.

END.

This type of loop might exist in a long-running background process or such.  You will see that this program (SeedLogicWorker.p) creates new static temp-tables, and they are leaked rapidly ... resulting in a SYSTEM ERROR about too many indexes.


>> As the class instance is not the "owner" of the dataset/temp-table the GC will not clean that up when your object instance is GC'ed.

Yes, I know that the class is not the owner.  That was deliberate.  This is a logic class, and doesn't actually own the data, but operates on data from a client.  I want the class instance to hold loosely to the table until the work is done, and the instance dies.  When the work is done and the instance dies, I'd like it to be GC'ed and for the static temp-table data to be released as well.  The only two things that are actually "pinning" the static temp-table are : (1) the instance of the class and (2) the calling program (SeedLogicWorker.p).

So can I put a pause statement or something similar in the code to encourage GC (assuming that is possible in _progres processes)?  Any tips would be appreciated.  

I was surprised that the problem is so easy to create.  It is a basic static temp-table and a basic OOABL class.  I can do my best to avoid the SYSTEM ERROR by using "DELETE OBJECT" statements, but would prefer not to clutter code with FINALLY blocks that do nothing more than what the GC should already be handling for me.

Posted by jmls on 28-Jan-2020 16:31

Are you aware that the GC only works with object classes, not handles?

As you mentioned that sometimes you get errors, perhaps this is of significance?

knowledgebase.progress.com/.../Implicitly-created-temp-table-not-deleted-when-returning-with-an-error-from-called-procedure

Also, when you mention "static temp-table", is the temp-table definition static?

Posted by dbeavon on 28-Jan-2020 16:59

>> are you aware that the GC only works with object classes, not handles ?

Yes, it's all shown above.  Those programs will work if you copy/paste them into PDSOE.  At the core are one simple static TT definition, and one simple class that BINDs to it.  The class instance will cause the static TT to be "pinned" in memory.  And I'm assuming the static TT won't "go away" (for lack of better terminology) until the class instance is first GC'ed.

That last snippet is a program that has to be called in a loop, in order to see the problem.

DO WHILE TRUE:

          RUN dkb/ReproBug/SeedLogicWorker.p.

END.

It is comforting to know that GC is supposed to happen in _progres processes, but I don't understand WHEN I should expect it, or if there is a way to encourage it to happen.  I'm told it somehow happens synchronously but not much beyond that.  I'm assuming that whenever it happens, the static TT will "go away" and prevent a critical SYSTEM ERROR.

Posted by Laura Stern on 28-Jan-2020 17:08

Garbage collection will happen on every statement if there is something to be GC'd at that point.

Posted by Fernando Souza on 28-Jan-2020 17:23

Are you running the code exactly how it's shown above, that is, you don't have anything else in the "do work here" part? What version are you running? If you are always passing the same temp-table from the one .p that has the loop to the class, there is only one instance of the temp-table, so it couldn't be leaking. We won't create one inside the class, but instead bind it to the one in the caller procedure, so something else seems to be going on here. It could have been some sort of bug, depending on what release you are running with. I just tried it with 11.7.6 and everything seems to be working properly. You can start your session with -clientlog and use -logentrytypes temp-tables:4,dynobjects.class:4 and see if the objects are getting garbage collected. They should be.
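
If restarting the session with new startup parameters is inconvenient, the same logging can probably be switched on from ABL through the LOG-MANAGER system handle. This is only a sketch: it assumes no client log was already opened with -clientlog, the log file name gc-trace.log is made up, and the entry types are simply the ones named above.

```abl
/* Hypothetical log file name; adjust for your environment. */
LOG-MANAGER:LOGFILE-NAME    = SESSION:TEMP-DIRECTORY + "gc-trace.log".

/* Same entry types and levels as the -logentrytypes value above. */
LOG-MANAGER:LOG-ENTRY-TYPES = "Temp-tables:4,DynObjects.Class:4".

RUN dkb/ReproBug/SeedLogicWorker.p.

LOG-MANAGER:CLOSE-LOG().
```

The resulting log should show whether each created object and temp-table has a matching delete entry.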

Posted by dbeavon on 28-Jan-2020 18:02

Hi Fernando and Laura, thanks for replying.

Yes, the code is exactly what is shown.  Nothing needs to be added to "do work here".  The only additional thing to do is call the whole business in a loop, as shown below.  (I do this in the ABL scratchpad.)  The working set of the _progres process grows by a few MB per second and the process crashes after about a minute; the crash seems to be based on the number of indexes used by all the static TTs:

DO WHILE TRUE:

         RUN dkb/ReproBug/SeedLogicWorker.p.

END.

I thought I could encourage GC/unpinning of TT by crossing procedure boundaries, or introducing PAUSE 0.001 statements or something.  But I haven't found the key to making the TT's go away, except if I explicitly use a "DELETE OBJECT" statement.  

I'm using OE 11.7.5.  When you tried with OE 11.7.6 did you use an outer loop to create more than one TT instance?   If so then I will be eager to download and install it whenever that becomes possible. My understanding is that 11.7.6 is scheduled to be released in Q2, 2020, right?

Posted by Fernando Souza on 28-Jan-2020 20:24

Sorry, typo. I meant 11.7.5.

I missed the fact that there was a loop outside calling the .p with the temp-table. I see now. So the problem is that a circular reference is being created. The .p that contains the temp-table can't go away because the temp-table is bound to the object instance, and the object can't be garbage collected because it is in the .p that can't be released. So deleting the object yourself when you are done with it will break that link and resolve the issue.
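
A minimal sketch of that clean-up, applied to the SeedLogicWorker.p loop posted earlier, could look like this. It relies on the include and class from this thread. Putting the delete in a FINALLY block and adding ON ERROR UNDO, THROW to the DO block are assumptions on my part, chosen so that the delete also runs when the work raises an error.

```abl
/* Same include and variable definitions as in SeedLogicWorker.p above. */
{dkb/ReproBug/Data/LocalSalesTransData.i}

DEFINE VARIABLE v_DocLogic AS dkb.ReproBug.SeededDocumentLogic NO-UNDO.
DEFINE VARIABLE v_Loop     AS INTEGER NO-UNDO.

DO v_Loop = 1 TO 5 ON ERROR UNDO, THROW:

   EMPTY TEMP-TABLE LL_sls_trans.

   v_DocLogic = NEW dkb.ReproBug.SeededDocumentLogic(INPUT-OUTPUT DATASET DS_LocalSalesTrans BIND).

   /* DO WORK HERE */

   FINALLY:
      /* Runs at the end of every iteration, with or without an error. */
      /* Deleting the instance releases its binding to the dataset.    */
      IF VALID-OBJECT(v_DocLogic) THEN
         DELETE OBJECT v_DocLogic.
   END FINALLY.

END.
```

With the binding released before the .p ends, the procedure and its static temp-table should be able to go away normally.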

Posted by dbeavon on 29-Jan-2020 02:13

When control returns from RUN dkb/ReproBug/SeedLogicWorker.p then I'm back in "DO WHILE TRUE".  At that point my custom programming has *no* references to either the static TT *or* the object instances.  

So my question is: how do I release resources after the point that execution control has returned to "DO WHILE TRUE"?  (E.g. let's say that is my outer-most loop for a batch process, and is where I'm willing to put a bit of extra plumbing.)

IE. is there any way to trigger the GC to clean up after I've lost track of my own references?  Do I have to create my own GC routine, like what Mike Fechner did above using SESSION:FIRST-OBJECT and looping thru VALID-OBJECT?  That would be unfortunate.  Is there a KB article about these "circular" references by any chance?  I'm assuming that they aren't a problem if everything is consistently object instances, right?  But in our case this is a combination of static data and object instances so perhaps that's what confuses the GC?

I wouldn't have guessed that my simple static data definitions would instigate these types of troubles.  Going back to appserver/PASOE, then in that case would the GC be able to clean up the circular references during the "forced" GC that happens during deactivate?  Or would I be left with a leak there as well?

Posted by dbeavon on 29-Jan-2020 17:47

I did find the KB about circular references.  Apparently the OOABL GC isn't able to clean them, even in the case of regular OOABL object instances.  See

knowledgebase.progress.com/.../What-are-the-limitations-of-the-AVM-garbage-collector

I'm wondering if I should open a new support case so that the KB article can be expanded?  It doesn't say anything about the potential for this issue to happen in conjunction with static data.  Ie, perhaps the related enhancements that are mentioned will not help us where static data is concerned (Enhancements PSC00231977, PSC00327793)

Posted by dbeavon on 30-Jan-2020 20:50

I don't mean to pester but it would be really helpful to know how these "circular references" affect long-lived ABL sessions in PASOE.

Will the so-called "forced" GC be able to clean up the circular references during appserver's deactivate?  Or would I be left with an AVM memory leak on that side of things as well?

Assuming the worst case scenario where ABL sessions in PASOE also leak static temp-tables ... I believe that I already have an effective "memory collection strategy" of my own in place.  I enumerate and trim all inactive ABL sessions for all PASOE applications after every hour.  Is this strategy sufficient to release the so-called "circular references"?  

(IE.  I'm assuming that even though they are "circular", these memory references are all contained within the context of the ABL session. And I'm assuming that trimming the ABL session as a *whole* will finally set the memory free.  The "memory collection strategy" was intended for precisely this type of situation.  I'm optimistic that will do the trick!)

Posted by Fernando Souza on 30-Jan-2020 21:08

GC will not kick in if there are references to the object.  In the case you provided, the .p has a reference to the object instance so GC won't kick in until that .p is released. Normally it would, at the end of the program, but because you have an object bound to the temp-table, to avoid causing failures, the AVM will wait until that object instance is deleted, to remove the dependency on the temp-table, so you are not left with an object that thinks it is still bound to a temp-table after the temp-table has been deleted. Whenever you have procedures/objects bound to a temp-table, you will need to delete the procedure/object that is bound to the temp-table/dataset once it is safe to do so. Or you can just set the object variable to the unknown value.
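
As a rough illustration of that last option, the end of SeedLogicWorker.p could clear the variable in a FINALLY block on the procedure's main block. This is a sketch only, and it assumes v_DocLogic holds the only reference to the instance.

```abl
/* ... include, definitions and loop of SeedLogicWorker.p as posted above ... */

FINALLY:
   /* Dropping the last reference lets the GC delete the instance, */
   /* which in turn releases its binding to the dataset.           */
   v_DocLogic = ?.
END FINALLY.
```

If anything else still references the instance, this will not be enough and an explicit DELETE OBJECT is the safer choice.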

Posted by Peter Judge on 30-Jan-2020 21:22

>> (IE.  I'm assuming that even though they are "circular", these memory references are all contained within the context of the ABL session. And I'm assuming that trimming the ABL session as a *whole* will finally set the memory free.  The "memory collection strategy" was intended for precisely this type of situation.  I'm optimistic that will do the trick!)

 
Trimming (deleting/destroying) an AVM session in the agent will remove those objects and tables from memory.  Session-going-away doesn't care about circular references.
 
 
 
 
Posted by dbeavon on 30-Jan-2020 23:01

>> Session-going-away doesn't care about circular references.

Thanks, that is definitely what I wanted to hear.  

We do have a steady leak of PASOE memory in the _mproapsv.exe at a rate of a few 100 MB per day; but this is memory that is inexplicably being leaked *outside* the context of the AVM sessions.  (Presumably this is leaked by some bad Progress native code in the outer portions of _mproapsv.exe, rather than in ABL code.)  I appreciate the confirmation that these OO and static temp-table references aren't contributing to the persistent leak that we see *outside* of our (regularly-trimmed) AVM sessions.

Let me know if anyone would be able to help with forensic analysis of a large  _mproapsv.exe process (eg. 4GB committed in physical ram).  It has *no* AVM sessions in it at all.  I'd love to know what is contained in there.  Whatever it is, it is probably very repetitive!

Posted by Laura Stern on 30-Jan-2020 23:58

You should log an issue with tech support if you want help with your forensic analysis!

Posted by dbeavon on 31-Jan-2020 15:22

I gave them one case with a dump, and then we closed it after a long period of inactivity.  I have tried using WinDbg and doing some small investigation myself.  It appears to be heap allocations in native memory.  Beyond that it is very hard to recognize the contents of that memory.  

PASOE agents are the longest-lived OE processes we have ever worked with, so our memory issues are somewhat more serious than they were in any given _progres CHUI.  And the PASOE agent processes have high utilization and support concurrent sessions, which compounds any memory leak.

Posted by dbeavon on 31-Jan-2020 15:40

>> PASOE agents are the longest-lived OE processes we have ever worked with ...

On that note I think there could be a solution to memory issues that is tailored to the design of PASOE as a whole.  For example, I think that, as a reasonable fix, there should be an upper limit on the MSagent's memory usage.  If the _mproapsv process grows above some mind-blowing value like 5 GB, then the ABL application should gracefully abandon that original _mproapsv process and just start a new one.  The original _mproapsv process could just be flagged for deletion after it goes idle.

Unless tech support can prioritize the analysis of a memory dump, my plan at this point is to wait for another large customer to start reporting leaks in PASOE (maybe QAD will start adopting this soon?), so that I can benefit from their efforts.  I've already worked on a few PASOE tech support issues myself.

... and I would spend more time on this too if I could do it productively.  But coming up with a repro is very challenging when the leak is "only" a few MB per hour, and is in native memory that is outside the boundaries of the AVM sessions.  If someone could tell me what is contained in the process memory dump, then it might give me the hint I need to create a repro.  It seems like it has to be a collaborative process, and I cannot build a repro on my own, as things stand right now.

Another thing I'm hoping Progress might do is to eventually enable _mproapsv inactivity timeouts in the PASOE product by *default*.   (It is hard to trust the way we specify this resource timeout configuration in PASOE today, and I suspect that few people are actually using it to cycle their _mproapsv processes.)

Posted by Shelley Chase on 31-Jan-2020 16:57

>> On that note I think there could be a solution to memory issues that is tailored to the design of PASOE as a whole.  For example, I think that, as a reasonable fix, there should be an upper limit on the MSagent's memory usage.  If the _mproapsv process grows above some mind-blowing value like 5 GB, then the ABL application should gracefully abandon that original _mproapsv process and just start a new one.

This is the purpose of the REST and JMX APIs in PAS for OE. You can use an APM* tool that supports REST or JMX. We decided this approach was better since it supported all our platforms and is what is done throughout the industry. See knowledgebase.progress.com/.../How-to-call-into-the-OEManager-s-REST-API-for-insight-into-PASOE

*Application performance management (APM) is a discipline that includes all the tools and activities involved in observing how software and hardware are performing. These tools present that performance information in a form managers and developers can use to make decisions.

Also, if you are on PASOE 11.7.5 or 12.x, there is a memory leak analyzer tool that was demo'd on the CVP. As long as you are a CVP member you can access a video and the tool. See https://community.progress.com/community_groups/openedge_customer_validation_program/m/documents/3729

-Shelley

Posted by dbeavon on 31-Jan-2020 17:11

Thanks Shelley.  I'll check out the video.  Most of the articles I've seen about detecting leaks are tied to the ABL sessions (by way of oemanager REST queries or whatever).    

In our case there aren't any ABL sessions in the msagent since they've already been trimmed.  So the results of any oemanager queries against ABL sessions always come back empty.  There needs to be a way to track down the memory that is used outside the bounds of our ABL sessions. That is memory that we cannot be held responsible for anymore as ABL programmers.

This thread is closed