[2 days left] What’s wrong with your cloud strategy? Learn why multicloud solutions matter with Nimble Storage.Register Now


Updating Partition Key Columns

Posted on 2006-10-30
Medium Priority
Last Modified: 2008-01-09
I am designing a database in IBM DB2 UDB version 8 for z/OS and OS/390. The database will have several very large tables that will require partitioning. We intend to use a single SMALLINT column, PARTN_ID (Partition ID), as the partition key column on each partitioned table. The column value would determine the partition to which INSERTed rows would be put.

Now, for my questions.

1. If we UPDATE column PARTN_ID for existing rows via static SQL, will DB2 automatically redistribute the updated rows to the partitions indicated by the new values in PARTN_ID, or will the redistribution only happen at the next REORG?

2. Will a RUNSTATS be required or merely advisable after the redistribution or REORG?

3. Will plans need to be rebound after the redistirbution or REORG?
Question by:dlmedici
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
LVL 46

Assisted Solution

by:Kent Olsen
Kent Olsen earned 300 total points
ID: 17837080

1)  The DBMS (in this case DB2) will locate data according to data partitioning key.  If the new key indicates that the row should exist in another partition, DB2 will move it for you.

2)  Not sure what you mean by redistribution.  If you mean will the statistics be updated simply because a row was moved to another partition, the answer is "no".

3) No.

Good Luck,

LVL 13

Assisted Solution

ghp7000 earned 600 total points
ID: 17837332
yes, thats exactly correct. If you update the partitioning key, and the partitioning map says they key belongs in another partition, it will be moved there, but the runstats is now out of balance, because the row count is now different among the partitions. This may or may not matter, especially if you are frequently updating they keys-what matters is the current distribution among the partitions. When the row distribution gets out of balance, meaning the rows are not more or less evenly distributed among the partitions (for very large tables), that is when you need to to redistribute the partitions.
All this is possible, but it begs the question of why would you choose a column as the partitioning key if you know in advance that you are going to frequently update it? Not only will this create loads of additional work for you, but from a design point of view its a poor choice. The key should be chosen for its ability to satisfy sql where clause and group by queries and by its very nature should not be updateable, even though DB2 permts it. By frequently updating they key, you will run into poor access plans and loads of broadcasting as well as mkaing it next to impossible to co locate the users with their correct partitions, which will require additional network resources for the FCM blocks. So, I think you would be advised to rethink your strategy, your heading into a deep abyss from which there is no return.

Author Comment

ID: 17837698
Ok, please confirm my point-by-point understanding below...

1. Updating the partition key will cause the rows to redistribute (i.e., move) immediately.
2. A RUNSTATS would be required because the data is now located in different partitions.
3. No REORG is required.

Does not #2 above imply that a rebind of plans/packages would also be required, to take into consideration the new stats?

Now, to ghp7000's comments. We do not anticipate frequent updates of the partition key. However, it is possible that after some time we may discover that a client is now sending us more data than we anticipated, and that we would want to relocate his data into a an empty partition (assuming he originally shared a partition with other small clients), or spread his data over two or more parititions. In that case, the algorithm that populates PARTN_ID would be updated, static SQL UPDATEs of existing PARTN_ID columns for the client would be executed (thus moving the data), and RUNSTATS would be executed.

I agree that a partitioning key should satisfy well the SQL predicates, but we are also targeting overall performance. By creating a partition key that is flexible (via an algorithm) we can adjust performance for particular clients, without redefining partition ranges.

Does this make sense?
LVL 37

Accepted Solution

momi_sabag earned 600 total points
ID: 17840464

all the things you are saying are correct.
db2 will move a row between partitions when you update the partitioning key.
it is recommended to run runstats regulary in order to keep the statistics updated, but you should consider the beneift when doing so. if you only move 5% of the rows from one partition to another partition, then maybe you should not run runstats, especially if your partitions are very big, and runstats will consume alote of resources.
if the total number of rows in a partition (either the one you delete from or the one you insert to) does not change in more than 10% i would not run runstats especially for that (assuming you are running runstats on some regular basis so the new statistics of the tables would be updated eventually).

beside for the case that you describe here, it is a good thing to rebind plans after you run runstatns and reorgs so db2 will be able to re-evaluate the access path

for the specific scenario you described,
when you set the partitioning key the way you said you do get the maximum flexability, and i understand why you would like to move some clients between partitions now and then, but maybe you should take another approache
in db2 for zos you can alter the partitioning key (does not matter if you used index based or table based partitioning) and after you do that run a reorg tablespace that will rebalance the rows.
this is a better way to do the update if you are about to update many rows, because when you update many rows using a sql statement you will cause massive logging, you will probably hit the maxium lock allowed and lock the entire partition, and it's a bit complex to implement commit logic into this massive update. instead if you rebalance using the reorg utility, it would go faster,
you would be able to recluster the data, reallocate the free space, and run runstats while running reorg.

so for the bottom line,
if you think you will be moving a large amount of rows between partitions, i would recommend choosing a different partitioning column (for example client code), and when you want to move the rows, instead of updating the partitioning key, just alter the partitioning key and run a reorg

momi sabag

Featured Post

Get free NFR key for Veeam Availability Suite 9.5

Veeam is happy to provide a free NFR license (1 year, 2 sockets) to all certified IT Pros. The license allows for the non-production use of Veeam Availability Suite v9.5 in your home lab, without any feature limitations. It works for both VMware and Hyper-V environments

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Recursive SQL in UDB/LUW (you can use 'recursive' and 'SQL' in the same sentence) A growing number of database queries lend themselves to recursive solutions.  It's not always easy to spot when recursion is called for, especially for people una…
Recursive SQL in UDB/LUW (it really isn't that hard to do) Recursive SQL is most often used to convert columns to rows or rows to columns.  A previous article described the process of converting rows to columns.  This article will build off of th…
This course is ideal for IT System Administrators working with VMware vSphere and its associated products in their company infrastructure. This course teaches you how to install and maintain this virtualization technology to store data, prevent vuln…
Visualize your data even better in Access queries. Given a date and a value, this lesson shows how to compare that value with the previous value, calculate the difference, and display a circle if the value is the same, an up triangle if it increased…
Suggested Courses

656 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question