Disconnect users during work partitionmanage split

Posted by Valeriy Bashkatov on 17-Jul-2014 06:05

Hello,

So I disconnected all users from the database. (Why it was necessary to disconnect users? See this post )

Now I can create Partitition Policy table account (from ATM test database)

Step 1

Step 2 (Add fields from index)

Step 3 (Load Details)

Automatically created 10 policies, next I click Finish. At this step all Ok.

Now I allow users to connect to the database.

Current tabanalys:

Now I want to split Composite partition (PUB.account.Initial(Composite)). I go to Edit Partition Policy Details in OE Management. Here I have to set a Split-target flag for each policy.

A small remark: when I clik Update nothing happens, i.e. Edit tab not close. Ok, clik Cancel – closed.

A small remark 2: I need to set split-target flag immediately for all existing policies. But I need to do this separately for each policy. Now I only have 10 policies, but what if I have a few hundred? It is not convenient.

Ok, next. I manually has set split-target flag for each policies and click Commit button. Users still work with database. No problem. Take tabanalys.

That is Ok.

Now I need to split data by partition. Users still work with database. I run command:

proutil atm -C partitionmanage split table account composite initial
OpenEdge Release 11.4ALPHA as of Fri Jun  6 18:21:20 EDT 2014
BEGIN: Split Operation For Table account (17384)
             Source Partition initial[0]
             Target Partition test-1[1]
             Target Partition test-2[2]
             Target Partition test-4[3]
             Target Partition test-5[4]
             Target Partition test-6[5]
             Target Partition test-7[6]
             Target Partition test-8[7]
             Target Partition test-9[8]
             Target Partition test-10[9]
             Target Partition test-3[10]
Index branchid has been identified as the scanning index (selected).
A non-unique index has been selected as the useindex index.
Additional locking is required with the use of this index branchid.
Number of Records per Transaction (recs): 100
Do you want to continue (y/n)?
Y
Waiting to obtain EXCL Partition lock on object 1, partition 0, row 0... (17605)
If you would like to cancel this operation, press CTRL-C. (8581)
... requested lock has been obtained. Operation continuing.
Waiting to obtain EXCL Partition lock on object 1, partition 0, row 0... (17605)
If you would like to cancel this operation, press CTRL-C. (8581)
... requested lock has been obtained. Operation continuing.
Waiting to obtain EXCL Partition lock on object 1, partition 1, row 0... (17605)
If you would like to cancel this operation, press CTRL-C. (8581)
... requested lock has been obtained. Operation continuing.
Waiting to obtain EXCL Partition lock on object 1, partition 1, row 0... (17605)
If you would like to cancel this operation, press CTRL-C. (8581)
... requested lock has been obtained. Operation continuing.
Waiting to obtain EXCL Partition lock on object 1, partition 1, row 0... (17605)
If you would like to cancel this operation, press CTRL-C. (8581)
... requested lock has been obtained. Operation continuing.
Waiting to obtain EXCL Partition lock on object 1, partition 1, row 0... (17605)
If you would like to cancel this operation, press CTRL-C. (8581)
... requested lock has been obtained. Operation continuing.
Waiting to obtain EXCL Partition lock on object 1, partition 1, row 0... (17605)
If you would like to cancel this operation, press CTRL-C. (8581)
... requested lock has been obtained. Operation continuing.
Waiting to obtain EXCL Partition lock on object 1, partition 1, row 0... (17605)
If you would like to cancel this operation, press CTRL-C. (8581)
... requested lock has been obtained. Operation continuing.
Waiting to obtain EXCL Partition lock on object 1, partition 1, row 0... (17605)
If you would like to cancel this operation, press CTRL-C. (8581)
... requested lock has been obtained. Operation continuing.
       Target partition: test-1[1], records moved: 100000.
       Target partition: test-2[2], records moved: 100000.
       Target partition: test-3[10], records moved: 100000.
       Target partition: test-4[3], records moved: 100000.
       Target partition: test-5[4], records moved: 100000.
       Target partition: test-6[5], records moved: 100000.
       Target partition: test-7[6], records moved: 100000.
       Target partition: test-8[7], records moved: 100000.
       Target partition: test-9[8], records moved: 100000.
1000000 records processed. (15165)
       Target partition: test-10[9], records moved: 100000.
       Source partition: initial[0], contains no records.
       Total records processed: 1000000.
 
END: Split Operation For Table account[0]
Split Operation finished successfully. (17359)

It`s looks like good, but …

 

All users, who worked at this time with this table, had been forcibly turned off automatically (see database log).

I think that is another reason why the use of Table Partitioning is not possible in large databases when lots of user online.

 

All Replies

Posted by Dapeng Wu on 17-Jul-2014 08:19

Valeriy,

First, partition split is a very expensive operation, because it involves a lot of data scanning, deletion, creation, index build etc. from both source partition and multiple target partitions. For many database vendors, this operation is actually offline, or will leave the whole table usable till indexes are built later. Although we have tried very hard to improve this feature, we still cannot make partitions involved fully online, for example, to allow full CRUD access on them while split is in process. Because of these, it needs more serious considerations to decide when and how to split partitions, which is different with other "lighter" operations such as idxcheck.

Second, the "waiting on lock" messages indicate there are users working on the partitions that are being split at the same time. Both clients and split process are competing to lock down the partitions involved. Split needs an EXCL partition lock to work properly, but it will release the lock from time to time, for example, after "recs" of records have been split, or the partition is processed. So it does allow some chances for clients to access the partition even though it's working on it.

Third, if a client is not working on the table or partition that is being split, it will not be affected.

Hope this will help.

Dapeng

Posted by Valeriy Bashkatov on 30-Apr-2015 08:39

Tables average size is ~600 Гб. Busines should be work 24x7x365. Users +/- 10 000.

So, it is not possible to implement Table Partitioning in such large system. But for small databases it is not needed.  

Posted by Aidan Jeffery on 30-Apr-2015 09:35

I want to point out that it is not necessary (or should not be necessary) to set the split-target flag on all partitions in a table at the same time. You can therefore limit each partition-split operation to just one, or a small number of partitions. By selective splitting and scheduling of the split operations it may be possible to minimize the impact on the majority of users.

Posted by Valeriy Bashkatov on 30-Apr-2015 09:47

Maybe in the theory is good, but in practice it is a big risk.

Interesting somebody already implemented Table Partitioning in their production databases?

Posted by Aidan Jeffery on 30-Apr-2015 09:55

Do you mean there is (more) risk involved in splitting partitions selectively rather than altogether?
Could you expand on what the risks are?
 
[collapse]
From: Valeriy Bashkatov [mailto:bounce-Arelav2@community.progress.com]
Sent: Thursday, April 30, 2015 10:48 AM
To: TU.OE.RDBMS@community.progress.com
Subject: RE: [Technical Users - OE RDBMS] Disconnect users during work partitionmanage split
 
Reply by Valeriy Bashkatov

Maybe in the theory is good, but in practice it is a big risk.

Interesting somebody already implemented Table Partitioning in their production databases?

Stop receiving emails on this subject.

Flag this post as spam/abuse.

[/collapse]

Posted by Valeriy Bashkatov on 13-May-2015 03:59

Is the risk that the data will not be available to users until split section ends.

Simple example -  table size 600 GB. We have created six sections (List), each section is about 100 GB. What do you think, how much time you will need to split one section? 1 hour? 6 hours? 1 day? more? All this time, the data in this partition will not be available to users, so the business application will not work, because the table interrelated.

I am wondering to know the experience of those who alredy applied TP for tables which size from 500 GB to 2 TB, and with +/- 2000 users in online.

And one more question. When a split partition we know that the data is not available, but how does it look? These data will not be included in the request, or the user will receive a warning?

P.S>

I apologize for not answering at once.

The notification from the Community do not work for me (

Posted by Richard Banville on 13-May-2015 10:58

 
To answer your second question:
 
If partitioning by order date, when attempting to access the order data that is being split such as with the following query:
 
FOR EACH Order NO-LOCK by Order-date:
    DISPLAY Cust-num Order-num Order-date.
 
The user will get a STOP condition with the following error message:
A partition of table “Order‘”cannot be accessed pending completion of database utility operation (17604)
 
Note that the data IS available for query is using a GLOBAL index rather than a LOCAL index.
 
 
 
[collapse]
From: Valeriy Bashkatov [mailto:bounce-Arelav2@community.progress.com]
Sent: Wednesday, May 13, 2015 4:59 AM
To: TU.OE.RDBMS@community.progress.com
Subject: RE: [Technical Users - OE RDBMS] Disconnect users during work partitionmanage split
 
Reply by Valeriy Bashkatov

Is the risk that the data will not be available to users until split section ends.

Simple example -  table size 600 GB. We have created six sections (List), each section is about 100 GB. What do you think, how much time you will need to split one section? 1 hour? 6 hours? 1 day? more? All this time, the data in this partition will not be available to users, so the business application will not work, because the table interrelated.

I am wondering to know the experience of those who alredy applied TP for tables which size from 500 GB to 2 TB.

And one more question. When a split partition we know that the data is not available, but how does it look? These data will not be included in the request, or the user will receive a warning?

P.S>

I apologize for not answering at once.

The notification from the Community do not work for me (

Stop receiving emails on this subject.

Flag this post as spam/abuse.

[/collapse]

Posted by Dapeng Wu on 13-May-2015 12:14

Besides queries using GLOBAL indexes, there are many cases that we do allow access to data that is being split silently. In addition, once a target partition is being split out, its data will also be available right away.

The actual time to finish the split depends on the amount of data to move and the speed of your system. Its major operations involve the followings:

1. read data that needs to be split;

2. create data and indexes in target partition;

3. delete data and indexes from source partition.

These steps are similar to what tablemove does, so that might give you some idea how much time it will take.

Another common way to move data is dump/load/build index. Because this solution needs extra I/O on dump files, split should be faster compared to it.

Dapeng

Posted by TheMadDBA on 13-May-2015 12:30

The current state of partitioning in OpenEdge isn't the best for large databases/tables. Either you have to dump and reload (outages) or you just don't implement it for large databases.

Having to kick users out combined with the amount of time required to actually split partitions and the vast amounts of AI generated (with the resulting issues for hot spares) make splitting online an unrealistic option for most large installs.

Most of the people that can realistically implement partitioning (today) are not the ones who actually need it.

Still... It is a good starting point and I have hopes that PSC will continue making improvements on this and all of the online utilities.

Posted by Richard Banville on 13-May-2015 12:46

We tried really hard to ease TP migration.  There are many different deployment scenarios that must be considered.
 
Not generating AI data  when moving data within the database will cause OE replication to be invalidated; something that we need to avoid.
 
Additionally, there are partitioning schemes that can be employed which do not require moving of existing data.  Those involve partitioning on new data ranges for example (new date ranges is a common one).
 
We weighed other ways of partitioning existing data that were rejected internally after some debate.  I’d like to hear suggestions on how we can improve migration for you within the given architectural constraints.
 
 
 
 
[collapse]
From: TheMadDBA [mailto:bounce-TheMadDBA@community.progress.com]
Sent: Wednesday, May 13, 2015 1:32 PM
To: TU.OE.RDBMS@community.progress.com
Subject: RE: [Technical Users - OE RDBMS] Disconnect users during work partitionmanage split
 
Reply by TheMadDBA

The current state of partitioning in OpenEdge isn't the best for large databases/tables. Either you have to dump and reload (outages) or you just don't implement it for large databases.

Having to kick users out combined with the amount of time required to actually split partitions and the vast amounts of AI generated (with the resulting issues for hot spares) make splitting online an unrealistic option for most large installs.

Most of the people that can realistically implement partitioning (today) are not the ones who actually need it.

Still... It is a good starting point and I have hopes that PSC will continue making improvements on this and all of the online utilities.

Stop receiving emails on this subject.

Flag this post as spam/abuse.

[/collapse]

Posted by Dapeng Wu on 13-May-2015 12:54

Posted by Dapeng Wu on 13-May-2015 12:58

We understand all the difficulties that customers with large db are facing during TP migration. While we’ll continue working on improvements in this area, customers with smaller databases should actually start to consider TP, even if they don’t need it now --- the data can only grow bigger!

Dapeng

Posted by TheMadDBA on 13-May-2015 13:08

Rich: I understand completely why you (PSC) did what you did for the first release. Given how long a few of us had wanted partitioning it is indeed appreciated.  I just hope that in the next few years we look back on the first release of partitioning like we do on ancient versions of idxbuild,online schema changes,etc. now.

If the partition split could happen with users online and make it multi-threaded that would help quite a bit... especially if the AI apply was multi-threaded as well. Something along the lines of locking a record or even all of the records in a block and then release them when moved. So we could realistically split the partitions over time without having to take an extended outage or series of outages.

The other option would be to come up with a multi-threaded offline split utility that would split the partitions much faster than the time it takes to dump, reload and idxbuild. Even if it didn't rebuild the indexes we could manage that by rebuilding per partition at least (with all the cool idxbuild parameters we have now).

Asking a 24/7 or even a 24/6 shop for either a large outage or a number of smaller outages isn't easy especially since most of what we (customers in general) are getting out of TP version 1 is ease of maintenance and not drastic performance improvements.

Posted by Richard Banville on 13-May-2015 13:12

 
Thanks for the feedback.  Much appreciated.
 
 
[collapse]
From: TheMadDBA [mailto:bounce-TheMadDBA@community.progress.com]
Sent: Wednesday, May 13, 2015 2:09 PM
To: TU.OE.RDBMS@community.progress.com
Subject: RE: [Technical Users - OE RDBMS] Disconnect users during work partitionmanage split
 
Reply by TheMadDBA

Rich: I understand completely why you (PSC) did what you did for the first release. Given how long a few of us had wanted partitioning it is indeed appreciated.  I just hope that in the next few years we look back on the first release of partitioning like we do on ancient versions of idxbuild,online schema changes,etc. now.

If the partition split could happen with users online and make it multi-threaded that would help quite a bit... especially if the AI apply was multi-threaded as well. Something along the lines of locking a record or even all of the records in a block and then release them when moved. So we could realistically split the partitions over time without having to take an extended outage or series of outages.

The other option would be to come up with a multi-threaded offline split utility that would split the partitions much faster than the time it takes to dump, reload and idxbuild. Even if it didn't rebuild the indexes we could manage that by rebuilding per partition at least (with all the cool idxbuild parameters we have now).

Asking a 24/7 or even a 24/6 shop for either a large outage or a number of smaller outages isn't easy especially since most of what we (customers in general) are getting out of TP version 1 is ease of maintenance and not drastic performance improvements.

Stop receiving emails on this subject.

Flag this post as spam/abuse.

[/collapse]

Posted by Valeriy Bashkatov on 10-Jun-2015 02:40

Just about the need for a multithreading. Here is a small part of my experiments:

$ time proutil largedb3 -C partitionmanage split table theTable composite initial
OpenEdge Release 11.5.1 as of Tue Mar 31 19:00:25 EDT 2015

BEGIN: Split Operation For Table theTable (17384)

             Source Partition initial[0]
             Target Partition IntField-1[3]
             Target Partition IntField-2[4]
             Target Partition IntField-3[5]

Index IntField has been identified as the scanning index (selected).

A non-unique index has been selected as the useindex index.
Additional locking is required with the use of this index IntField.

Number of Records per Transaction (recs): 100

Do you want to continue (y/n)?
y
1000000 records processed. (15165)
2000000 records processed. (15165)
3000000 records processed. (15165)
4000000 records processed. (15165)
5000000 records processed. (15165)
6000000 records processed. (15165)
7000000 records processed. (15165)
8000000 records processed. (15165)
       Target partition: IntField-1[3], records moved: 8184959.
9000000 records processed. (15165)
10000000 records processed. (15165)
11000000 records processed. (15165)
12000000 records processed. (15165)
13000000 records processed. (15165)
14000000 records processed. (15165)
15000000 records processed. (15165)
16000000 records processed. (15165)
       Target partition: IntField-2[4], records moved: 8183704.
17000000 records processed. (15165)
18000000 records processed. (15165)
19000000 records processed. (15165)
20000000 records processed. (15165)
21000000 records processed. (15165)
22000000 records processed. (15165)
23000000 records processed. (15165)
24000000 records processed. (15165)
       Target partition: IntField-3[5], records moved: 8181016.
       Source partition: initial[0], still contains records.
       Total records processed: 24549679.

END: Split Operation For Table theTable[0]
Split Operation finished successfully. (17359)

real    1654m33.155s
user    376m48.248s
sys     16m30.662s

This thread is closed