DATA ARCHIVING-DELETING RECORDS FROM MULTIPLE TABLES WITHOUT REF.INTEGRITY

Hi Experts


We have a huge production database which has 2000+ tables and the volume of records in few of the tables have grown to more than 5 mill +

We have been assigned the task of archiving records for the past years. The application is HUGE ERP system. The tables do not have ref integrity, and all possibility of having orphan records in the detail tables.

I am thinking of developing a utility,

A back up of the production Database

Definition- UI to define the module and related tables
------------------------------------------------------

a) which will allow the user to select the module ( Metadata of all available modules) and allow user to select the tables pertaining to the module and accept the sequence (to denote which is the main table)

b) The system should then check for available ref integrity and display the same for reference

c) If no ref. Integrity is found among tables( may not be required for few tables, being updated from other modules), the system should allow the user to select a table and list all columns. Check if ref. Integ is available for the selected columns, if No then create a fK temporarily for the purpose of deleting the records from the child tables

d) Allow user to select a table and select a column and accept a value ' STATUS FLAG' (.i.e. CLOSED,CANcELLED) a closed invoice, cancelled invoice
UI to accept criteria for deletion

UI - to accept the Data Range
------------------------------

d) The system should allow the user to select the module, and accept date (till which the records will be deleted), The system should generate dynamic sql for deleting records based on the value accepted earlier and the date

alternatively - The user should identify the SP, which is used to create a new record for the module (.i.e. Create Invoice), and allow yhe user to copy paste the SP, and the system can parse for all the tables used in the module, and allow the user to modify the list (some temp tables might be used)

The reason for the above planned approach is We do not know the application, and hence could not analyze and arrive at a relationship, and also due to the changes made inhouse to the system, will help in not making changes to to the delete scripts frequently


Does the above sound ok, I am aware of the tables being updated by the other modules, and have validated that the user should use his/her intelligence to seqneuce the modules for deletion

I would like to hear from you on possible solutions and approaches
SrirampriyaAsked:
Who is Participating?
 
SrirampriyaAuthor Commented:
I use Erwin. the database does not have ID keys, and that is good. The system is a custom built ERP system, and most of the tables do hold data.

as indicated earlier, I am also looking into accepting the SP for a module and parse the tables used (also validate using system sp to validate whether that table is referred by the sp. this list will be built earlier)

Can you please list down the steps that you feel might be necessary/critical, as I want to document the approach first before getting my hands dirty ( also I have lost touch with programming for some years now lol)

Is the experts forum so bored with these types of questions !!!
0
 
mrjoltcolaCommented:
Have you considered reverse engineering it with a commercial tool that can do foreign key inference? It is not perfect but might get you started.

I think the sound of your GUI utility sounds dangerous, but perhaps I do not understand. You are thinking of a wizard that will trust your users to determine foreign key relationships and take the responsibility of data deletes? I think it sounds dangerous.

I would definitely try to develop an ER diagram in a commercial quality tool first, and spend time with some of your subject matter expert users to define the foreign keys as much as possible.
0
 
SrirampriyaAuthor Commented:
Hi

Yes, have been trying to go through the ER diagram as well. Have roped in one guy for tele support to validate as much as he can. I have taken time and have clearly stated that there will be a lots and lots of testing (in the process of identifing reports to check if the delete has gone through well in a test server)

I would love to get some help from the experts on how to proceed, with some additional logic building and also suggest any available codes/scripts

Best Regards
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

 
mrjoltcolaCommented:
A tool, such as ER/Studio, can reverse engineer and INFER foreign keys, to suggest relationships where there are none, based on key naming. Now, if your schema has tons of "ID" keys, then it won't help much, but if you have more specifically named columns, it may help. I have used it on schemas of 100-200 tables and within a week produced moderately accurate ERDs. It will give you a place to start. What modelling tool are you using now?

With 2000 tables, you have a huge task before you. I really hope that the bulk of those 2000 tables are redundant data tables, and not core schema. This is the downside to not using explicit foreign keys, the metadata becomes lost in history, as it was never concrete. In these cases, the blunt truth is: You have human work before you that a tool cannot accomplish. Studying and mining the schema and data for relationships, and using your subject matter people (users, whatever) to find AND document as much as you can, building the ERD as you go.
0
 
mrjoltcolaCommented:
I am pretty sure ERwin reverse engineer step lets you select "Infer Relationships" or something similar. My memory gets mixed somethings between ERwin and ER/studio but I'm near 100% certain. Check the options in RE.

Regarding what steps are critical. Let me tell you, based on 15 yrs of Oracle work, I have never seen a case where a schema of your size could be automatically reverse engineered. Like it or not, what I do for customers is come into their shop, do an initial reverse engineer just to get the objects into an ER model. If there are explicit relationships, great, they come too. Many times there are not and I end up with a huge ER diagram with islands and ZERO lines. Then I sit with the customer, literally for days, and review tables, take their comments, using the modelling tool adding the comments as we go. Slowly, we organize the tables by sub-schema, subject matter, function, etc. and even start identifying some relationships. Once you do identify relationships, you may start physically adding them.

I guess I may not really understand what you mean by your sp approach. It seem we are missing each other a bit. To me, it sounds like you have a basic case of reverse engineering of a schema, which is what ERwin is made for. Not just the part where you point it at the datbase and click Go, I mean the weeks you spend analyzing and taking input from the customer.

Thats about all I can add. Good luck.

0
 
SrirampriyaAuthor Commented:
thanks for your feedback. Once I get this thing, will post the same for your review and comments. I dont want to close this question, just to see if someone else had attempted similar problem
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.