Rowids and Table Partitioning

Posted by Richard Banville on 20-Mar-2014 15:07

Rowids used in conjunction with their table number are unique identifiers of a specific record in the database.                                                        

Rowids in and of themselves (without the table information) are not unique per database since the implementation of storage areas in OpenEdge some time ago.

The rowid encodes information identifying the area block and record within that block.  The associated table information allows for mapping to the appropriate storage area within the database.

Unique record identification is critical to the operation of the database since it allows the storage engine to implement such things as record locking and proper record retrieval either directly or via index lookups (among many other things).  This uniqueness also allows the ABL to properly sort data for certain queries and properly maintain a result set (among other things).

With the implementation of table partitioning in OpenEdge, the current rowid information plus the table number is no longer sufficient to uniquely identify a specific record in the database.  Uniqueness is obtained by also including the partition id associated with the record.

The internal format of a ROWID has been extended to include the partition information.

The information associated with a RECID however will continue to only contain the encoded block and record within block information.  

All Replies

Posted by Richard Banville on 20-Mar-2014 15:08

So why is this important to you?

The new format of the ROWID provides a seamless transition for ABL code that does such things as:

myRowid = ROWID(customer).

Find customer where ROWID(customer) = myRowid.

(This is just one simple example of the proper use of ROWIDs.)

However, the format of a “RECID” will not (cannot) be extended to include this information. So code such as:

myRecid = RECID(customer).

Find customer where RECID(customer) = myRowid.

Should not be used if the table is partitioned as it most likely will not return the data expected (unless it is located in the initial partition 0 used for migration but even then it should not be used since data can migrate to another partition over time).

What should you do?

Prior to adopting table partitioning, eliminate the use of “recid” in your application prior to deployment.

Posted by Richard Banville on 20-Mar-2014 15:10

ROWIDs can change at runtime.

Prior to the implementation of table partitioning, the rowid of a record does not change for the life of the record; that is after it is created and until it is deleted the rowid is always the same.  In other words, any field in the record can be updated without the storage engine needing to change the unique identifier of the record.

With the implementation of table partitioning, this is not always the case.  If an update to a record involves changing the value of a partition aligned field and that change associates the record with a different partition definition, then the record must be re-located from its current partition to its new proper location in a different partition.  This is actually performed by an internal delete/insert combination.

The relocation of the record in this way may move the record to a different partition in the same storage area or even relocate the record to a completely different storage area depending on your partition definitions.  In any event, this will certainly change the unique identifier required to locate and maintain the record.

So why is this important to you?

If your application code “remembers” the rowid value across such a record update, it will not be able to find the record using the “remembered” rowid since its unique identifier has changed do to a relocation of the physical record.

This has an obvious side effect for “finds” mentioned in a previous post but can also affect the way ”result set” traversals behave in the ABL.

The ABL maintains the records in a result set by maintaining a list of ROWIDs for the records of that result set.  Prior to the implementation of table partitioning in OpenEdge, the only way a record could “disappear” from a result set was if it were deleted.  Even if the record were changed such that it no longer satisfied the open query, it would still remain accessible in the result set.

However, now, for a partitioned table, an update which relocates a record will cause the record to be inaccessible via the REPOSITION operation.  One must re-open the query in order to see the updated record in the result set (of course that is if the new values in the record still satisfies the query).

There has been a recent update here in that only REPOSITION will not work.  The next/prev will continue to work even if the record has been relocated.  Realize that this behavior is not available in the current ESAP but will be there in FCS and hopefully in the next TP ESAP drop.  So a NEXT/PREV will still find the record in the result set.

What should you do?

Realize that this can happen is the most important thing.  Also, you should avoid changing the partition aligned fields as a “usual” practice.  Not that there won’t be times when it is necessary, but it should be something that is not a regular practice if you typically access records in the way described above.

Posted by Jeff Ledbetter on 20-Mar-2014 15:52

What about ROWIDs stored as character variables and then converted using TO-ROWID()? Seamless as well?

Posted by Jeff Ledbetter on 20-Mar-2014 15:54

"If your application code “remembers” the rowid value across such a record update, it will not be able to find the record using the “remembered” rowid since its unique identifier has changed do to a relocation of the physical record."

Wouldn't that (in theory) break any application using ADM2?

Posted by Richard Banville on 20-Mar-2014 16:10

Yes, that will be fine as long as the record is not relocated

Posted by Richard Banville on 20-Mar-2014 16:11

We have discussed the affect to ADM2 internally and other such things.  It is still unclear to me.  I have asked internally for a more appropriate responce to your post so hang on...

Posted by Richard Banville on 20-Mar-2014 16:13

Yes, as long as the record they point to has not changed location.
 
[collapse]
From: Jeff Ledbetter [mailto:bounce-jeffledbetter@community.progress.com]
Sent: Thursday, March 20, 2014 4:53 PM
To: Beta.OE11.4@community.progress.com
Subject: RE: Rowids and Table Partitioning
 
Reply by Jeff Ledbetter

What about ROWIDs stored as character variables and then converted using TO-ROWID()? Seamless as well?

Stop receiving emails on this subject.

Flag this post as spam/abuse.

[/collapse]

Posted by Peter Judge on 20-Mar-2014 16:13

ADM2 only temporarily stringifies ROWIDs, IIRC (for use in the SDO RowIdent field), it does not persist those values to a DB, so for this case it should not cause problems.
 
However,  it might cause problems if you save record and it moves across partitions. I say might because I don't remember the gory details of when the RowIdent field is populated.
 
-- peter
 
[collapse]
From: Jeff Ledbetter [mailto:bounce-jeffledbetter@community.progress.com]
Sent: Thursday, 20 March, 2014 16:55
To: Beta.OE11.4@community.progress.com
Subject: RE: Rowids and Table Partitioning
 
Reply by Jeff Ledbetter

"If your application code “remembers” the rowid value across such a record update, it will not be able to find the record using the “remembered” rowid since its unique identifier has changed do to a relocation of the physical record."

Wouldn't that (in theory) break any application using ADM2?

Stop receiving emails on this subject.

Flag this post as spam/abuse.

[/collapse]

Posted by Laura Stern on 20-Mar-2014 16:13

You can still use TO-ROWID to convert a ROWID value stored in a CHAR variable back to a ROWID.
 
Laura
 
[collapse]
From: Jeff Ledbetter [mailto:bounce-jeffledbetter@community.progress.com]
Sent: Thursday, March 20, 2014 4:53 PM
To: Beta.OE11.4@community.progress.com
Subject: RE: Rowids and Table Partitioning
 
Reply by Jeff Ledbetter

What about ROWIDs stored as character variables and then converted using TO-ROWID()? Seamless as well?

Stop receiving emails on this subject.

Flag this post as spam/abuse.

[/collapse]

Posted by Jeff Ledbetter on 20-Mar-2014 16:18

However,  it might cause problems if you save record and it moves across partitions. I say might because I don't remember the gory details of when the RowIdent field is populated.”
 
That was what I was implying by “in theory”. J
 
I cannot imagine someone partitioning on such a value but it could happen..
 
 

Posted by Håvard Danielsen on 20-Mar-2014 16:44

The ADM2 has a RowIdent field in its internal temp-table. This is a comma separated character field that consists of the database rowid(s) of the corresponding table(s) in the database. This is used to find the corresponding database record(s) whenever the record is saved to the database. This is currently not refreshed on save. The plan is to change this to be refreshed on save in 11.4, so that subsequent saves can work also after a partitioned value is changed.  

Another known problem is that the optimistic locking check uses the client's rowid to find the current record for the comparison, so you would get a record not found instead of the optimistic lock error if the partitioned values have been changed by someone else.  The plan is to fix this in 11.4 also.

--

There is also a potential problem in that the batching logic uses rowids in the properties that points to the first and the last record in the batch. If someone else changed the partitioned values of these two records the batching would fail. There is no plan to fix this. Note that this is not a new problem, the batching would also fail if  if these records are deleted by someone else.  

Posted by Thomas Mercer-Hursh on 20-Mar-2014 16:56

So, the implication is that someone running a customized ADM2 who might previously brought forward the base and customizations from a prior release, could not do that with 11.4 if they planned on partitioning?

Posted by Håvard Danielsen on 20-Mar-2014 17:27

Obviously, If you use an old version of ADM2 you have to update the code yourself. I believe that would be quite easy, if you have the 11.4 source, but it might depend somewhat on how old the version you have as base is.

Posted by Marian Edu on 21-Mar-2014 06:02

Maybe you can consider a 'row movement' option for partitioned tables just like Oracle have, leave that as disabled by default and no rowid change will occur as the record remains in the same partition where was created and no indexes need to be updated either... you guys seemed to be very forgiving in the past with all that back-ward compatibility, sure you will find a way to smooth the transition on that one :)

jonathanlewis.wordpress.com/.../row-movement

as a side note the default 'disable row movement' was the most common case I've found so far but guess in a data warehouse that shouldn't come as a surprise... either way, the row movement in your implementation will probably be the same: delete+insert (with only the 'update' trigger to fire?) and there is a performance cost attached even for oltp

Posted by Richard Banville on 21-Mar-2014 12:04

There are rare cases when a database is damaged beyond repair and no good backup exists (maybe you deleted an extent by accident or there is some other hard failure preventing a database block from being read and someone neglected to validate the backups).  In such a circumstance if you want to dump as much data as you can out of the database you can try binary dump or ascii dump by table in the affected area and load the result into a newly created database.  However, when those tools access the damaged part of the database they will most likely fail.

It may be that there is data in good format beyond the failure point; it’s just that these tools will not continue looking beyond the damaged location.

An alternative sometimes employed is to dump the data for individual tables using the dump/export by recid mechanism.  For example (and of course there are better ways for determining the last record other than using the maximum and you should use them!):

maxRecid = 9223372036854775807.
DO getRecid = 1 TO maxRecid:
    FIND Order WHERE RECID(Order) = getRecid.
    IF AVAILABLE(Order)
        EXPORT Order.
END.

If a “non-continuable” error is encountered, you can restart the search with the next recid (or the recid in the next block).

Realize that this mechanism is not intended for the “casual user” and can be fraught with problems (data loss) if you are not 100% sure what you are doing. If you find yourself in such a position and do not fully understand the consequences, you should seek professional help;> 

In any event, as previously explained, you cannot use RECID to reference data in a partitioned table.  To solve this, we plan to provide a few static methods allowing you to iterate across a range of rowids for a particular table. The range can be particular to a specific partition or can be used to access records within the rowid range for all partition of a specific table.  A simple example using a partitioned "Order" table follows:

RowidGenerator:TableStart("Order", 0, 5000000, 2).
getRowid = RowidGenerator:GetNextRowid().

DO WHILE getRowid <> ?:
    FIND order WHERE ROWID(Order) = getRowid NO-ERROR.
    IF AVAILABLE(Order) THEN
         EXPORT Order.
    getRowid = RowidGenerator:GetNextRowid().

END.

TableStart parameters: (table name, starting record id, ending record id, partition #)  A partition # of "?" will scan all partitions of the "Order" table for the given rowid range.

NOTE: This is our intended solution for this issue and is still under development.  It therefore should be used for informational purposes only at this point as the syntax and implementation actually released may be different than described.

For those that have used such a mechanism in the past, I look forward to hearing if this will suffice for partitioned tables.

 

Posted by Richard Banville on 21-Mar-2014 13:22

[quote user="medu"]

... either way, the row movement in your implementation will probably be the same: delete+insert (with only the 'update' trigger to fire?) and there is a performance cost attached even for oltp

[/quote]

Yes, only the update trigger will fire.

Yes there will be a performance cost.  Realize that changing an index field in a record for a non-partitioned table results in an internal delete/insert operation for each associated index key today.  The difference in the partitioned world is that the record also requires the internal delete/insert operation where it did not before and the new rowid must be updated in the key entry of each index for the record's table as opposed to only needing to modify the indexes defined as using the fields that changed.

Implementing a "row movement" restriction is an interesting approach.  Of course you'd want the ability to catch an error when causing a "row movement" and the ability to change the setting for maintenance operations.  We'll "kick it around" a bit.  It's not something that would prevent you from adopting our Table Partition mechanism though, correct?

Posted by Håvard Danielsen on 21-Mar-2014 14:03

More on ADM2 and table partitioning:

The intention of the planned changes is to ensure that ADM2 can handle basic save operations on a partitioned database without errors on subsequent reads and saves from the same client.

Note that this does not mean that ADM2 will work without problems on partitioned tables in all cases. There are other read and find operations that are using database rowids as key that will fail if partitioned values have been changed by others after the record was read to the client.

This is mostly the case for methods that are not used internally. We will provide some guidelines on how to avoid the ones we know about and what to use instead. Later versions should have key based methods for most (all?) read and find operations.  

The older the ADM2 version is the more likely it is that rowids are used for no-lock reads. The key based methods may not exist and it may not be easy to retrofit these into an older ADM2 version even if you have the new source available.  

ADM2 visual objects should not have any dependencies on the database rowids, but in older versions the Rowident property that is used as the key to the SDO record also consisted of the database rowids.      

Posted by Laura Stern on 31-Mar-2014 08:17

To expand on this a little...

If you have a saved RECID of a record from a partitioned table and later try to find that record by this Id, most often it will just not find the record.  However, consider the scenario of a table that is still in the process of being migrated, such that some records, destined for their own partition, are still in the composite partition 0.  In addition, the saved RECID is from a record that has already been migrated to its own partition.  In this case, it is possible that the AVM may actually find a record matching this Id, but it will be the wrong record.

To avoid any issues like this, you can grep your source code for all uses of RECID so that you can remove them.  Another way to do this would be to use the -k <forget-list-file> startup parameter.  The forget list is case-insensitive with one keyword per line.  Make sure that there is one and only one carriage return after the last (or only) keyword in the file.  You can then compile all your code and the compiler will error on all uses of RECID.

This thread is closed