Hilaire
asked on
Changing Primary key on a big table
Hi experts, I've been using mostly SQL Server in the last 5 years, so my Oracle skills are a little outdated. I need your advice for the following :
On a datawarehouse Oracle 9i DB, a table has been wrongly designed and I'm asked to change the primary key to avoid dupes.
Let's call it STAR12, and assume the current primary key is on
(ID_MONTH int, ID_DWH int) where ID_DWH is a sequence number that resets for each new ID_MONTH
I had to alter the table to add a new field ID_GRP_RISK
And I need to
- populate the new field (I've done this part)
- delete the records where the new field is null (done it too)
- set a not null constraint on the new field
- drop the current primary key
- delete the duplicate records
- create the new primary key on (ID_MONTH, ID_GRP_RISK)
The requirements :
I've been told that applying the changes to the historical data is a requirement. Dropping the table and running the loading script to populate a brand new table for each and every month is not an option.
My current concerns :
I was wondering if it would not be better to keep the current index and just create a new unique index on (ID_MONTH, ID_GRP_RISK), and if you experts would have better ideas or advices to handle this situation.
Also I was wondering if it would not be more effective to
- create a non-unique index first, to speed up the deleting of the dupes (records that have the same ID_MONTH, ID_GRP_RISK)
- delete the dupes
- make it the primary key afterwards (is it even possible ?)
The table has about 1 million records per ID_MONTH, currently 20 months loaded.
To avoid filling the temp tablespaces and / or swapping forever, I loop on ID_Month with a cursor to put intermediary COMMITs after each month is processed.
Any ideas/sample code welcome (more efficient syntax), in particular
I will create another topic for the "delete dupes" part if need be.
Thanks for reading
Hilaire
On a datawarehouse Oracle 9i DB, a table has been wrongly designed and I'm asked to change the primary key to avoid dupes.
Let's call it STAR12, and assume the current primary key is on
(ID_MONTH int, ID_DWH int) where ID_DWH is a sequence number that resets for each new ID_MONTH
I had to alter the table to add a new field ID_GRP_RISK
And I need to
- populate the new field (I've done this part)
- delete the records where the new field is null (done it too)
- set a not null constraint on the new field
- drop the current primary key
- delete the duplicate records
- create the new primary key on (ID_MONTH, ID_GRP_RISK)
The requirements :
I've been told that applying the changes to the historical data is a requirement. Dropping the table and running the loading script to populate a brand new table for each and every month is not an option.
My current concerns :
I was wondering if it would not be better to keep the current index and just create a new unique index on (ID_MONTH, ID_GRP_RISK), and if you experts would have better ideas or advices to handle this situation.
Also I was wondering if it would not be more effective to
- create a non-unique index first, to speed up the deleting of the dupes (records that have the same ID_MONTH, ID_GRP_RISK)
- delete the dupes
- make it the primary key afterwards (is it even possible ?)
The table has about 1 million records per ID_MONTH, currently 20 months loaded.
To avoid filling the temp tablespaces and / or swapping forever, I loop on ID_Month with a cursor to put intermediary COMMITs after each month is processed.
Any ideas/sample code welcome (more efficient syntax), in particular
I will create another topic for the "delete dupes" part if need be.
Thanks for reading
Hilaire
ASKER
thanks for your comment sathyagiri
would you mind setting verbose mode on (no offence meant, I really appreciate your comment above ;-) ) and giving me a few tips on how to remove dupes using BULK COLLECT FORALL, so that I get the general idea ?
I'm going through the BULK COLLECT online reference just now, but I'm not familiar with it at all, so I'd be glad to get a short explaination so that Idon't miss the point.
Thanks again
Hilaire
would you mind setting verbose mode on (no offence meant, I really appreciate your comment above ;-) ) and giving me a few tips on how to remove dupes using BULK COLLECT FORALL, so that I get the general idea ?
I'm going through the BULK COLLECT online reference just now, but I'm not familiar with it at all, so I'd be glad to get a short explaination so that Idon't miss the point.
Thanks again
Hilaire
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thanks a million.
I really appreciate your help.
Regards
Hilaire
I really appreciate your help.
Regards
Hilaire
>>And I need to<<
- populate the new field (I've done this part)
OK
- delete the records where the new field is null (done it too)
OK
- delete the duplicate records
Needs to be done first
delete from star12
where rowid not in
(select row_id from
(select id_month, id_grp_risk, max(rowid) row_id
from star12
group by id_month, id_grp_risk));
- create the new primary key on (ID_MONTH, ID_GRP_RISK)
Should be done next
create unique index <index_name> on star12(id_month, id_grp_risk); => use keyword local if table is partitioned
- set a not null constraint on the new field
alter table star12 add constraint <same as index_name>
primary key(id_month, id_grp_risk) using index;
- drop the current primary key
alter table star12 drop constraint <old_constraint_name>;
drop index <old_index_name>;
- populate the new field (I've done this part)
OK
- delete the records where the new field is null (done it too)
OK
- delete the duplicate records
Needs to be done first
delete from star12
where rowid not in
(select row_id from
(select id_month, id_grp_risk, max(rowid) row_id
from star12
group by id_month, id_grp_risk));
- create the new primary key on (ID_MONTH, ID_GRP_RISK)
Should be done next
create unique index <index_name> on star12(id_month, id_grp_risk); => use keyword local if table is partitioned
- set a not null constraint on the new field
alter table star12 add constraint <same as index_name>
primary key(id_month, id_grp_risk) using index;
- drop the current primary key
alter table star12 drop constraint <old_constraint_name>;
drop index <old_index_name>;
Sorry, I shouldn't have gone to lunch leaving the question open. I see you already have the solution. Good job, guys:-)
1.For deleting your duplicates, think of using BULK COLLECT using FORALL.
3.Then drop the nojn-unique index and create your primary key