Solved

Look for a SQL Term

Posted on 2011-09-26
12
237 Views
Last Modified: 2012-08-14
I have several databases for the same program which I am working to use an ETL program to move data over into a brand new database.

Several of the entities, Items, Projects, Employees, have duplicate IDs or Keys and we are going to have to Recode, change, update etc so those so there are no duplicates in the final database.  Many to One.

My questoin is simply, what is this process called?  A co-worker is calling this normalization, but as I read about that term, that doesn't seem like the right term.  Does anyone know what this procedure is called??

Thanks so much!
0
Comment
Question by:red_75116
12 Comments
 
LVL 3

Expert Comment

by:_-W-_
ID: 36707642
SQL Join to Table
0
 
LVL 39

Expert Comment

by:Pratima Pharande
ID: 36707654
this is called normalization , there are 3 Normal forms

For Example = 1NF

Eliminate duplicative columns from the same table.
Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).

see for more details
http://databases.about.com/od/specificproducts/a/normalization.htm
0
 
LVL 2

Expert Comment

by:Sandeepratan
ID: 36707767
If you are going to remove the duplicates by changing the data itself and not altering tables. you can call this "Data Cleaning" if you are altering tables then its Normalisation
0
 
LVL 5

Expert Comment

by:GirardAndrew
ID: 36708227
Please assess if you need a normalized or denormalized database. Ideally, a denormalized database is much faster to query but could take up a lot of space and difficult to update.
0
 
LVL 8

Expert Comment

by:vinurajr
ID: 36708464
This is some thing called as data integration.
0
 

Author Comment

by:red_75116
ID: 36709301
Let me try to clarify my question

If I have two databases with tables called customers and both tables have key fields like ID where they simply increment like

ID    |  Name
001 |  ABC Company
002 |  ACME Brick

and in the other I have
ID    |  Name
001 |  Checkers
002 |  Home Depot

And I am doing an ETL project and the customers from both of these databases are going to get copied and maintained in another database.  I  cant have all these records in the new database cause the Key field won't be unique so we have to  (clean, normalize, align, whatever the word is) and rekey some the data to a prevent any duplicates.  One simple solution would be to do this.

ID      |  Name
A001 |  ABC Company
A002 |  ACME Brick

===================
ID       |  Name
B001 |  Checkers
B002 |  Home Depot


I am asking what the task to rekey this data is called.  (clean, normalize, align, whatever the word is)

Thanks

0
Control application downtime with dependency maps

Visualize the interdependencies between application components better with Applications Manager's automated application discovery and dependency mapping feature. Resolve performance issues faster by quickly isolating problematic components.

 
LVL 50

Assisted Solution

by:Lowfatspread
Lowfatspread earned 500 total points
ID: 36709307
i believe you are looking for the process of applying surrogate keys to your warehouse/mart ...

this is the process where you supply a new set of keys to the data so that within the database you have independence from the source systems and can support multiple source systems , some using similar (external) keys... which will be resolved by the ETL processing into the relevant new internal key for the warehouse...

it is up to your business logic in the etl stage to determine if you can aggregate the different source objects into existing objects already processed from other systems into the database.

you are conforming/standardising the data values to be held


normalisation is the process of designing the structures to hold/represent the data/relationships
0
 
LVL 50

Expert Comment

by:Lowfatspread
ID: 36709319
its often called/performed as part of the DATA CLEANSE phase of the ETL processing
0
 
LVL 50

Expert Comment

by:Lowfatspread
ID: 36709328
0
 

Author Comment

by:red_75116
ID: 36709334
Lowfatspread:,

You answer sounds the closest so far, but there are existing application databases so I dont have the ability to add new keys, but am limited to having to change them.  Kinda ugly I know.

This isnt a warehouse or mart, but the concept is reallly the same.  The databases are different divisions from an ERP program and then getting merged into a web based app every 2-3 hours.

Normalization doesn't seem like thr right word, but these might be.
0
 
LVL 50

Accepted Solution

by:
Lowfatspread earned 500 total points
ID: 36709536
not ugly ,standard procedure to ensure data independence in a "warehouse".

you don't change the source systems , its the new system which maintains the links between old and new ...

whether thats in the staging area or the main database is up to you.  you probably do maintain the existing source keys as a source reference (source system/source internal reference) on the data.

The design of the web based database , ideally shouldn't be just a copy of one of the source systems but rather a specific design for its requirements ... it presumably will have its own lifecycle independant of the source systems..

you haven't indicated if its supposed to be historical (a principle driver in the requirement to implement surrogate keys) or some sort of snapshot of the data (surrogate keys are required to ensure data integrity across the source systems)

i am assuming that you are mapping (conforming) the actual attributes of each source system into a set of "common"  codes for the new web app?

Normalisation definitly isn't the correct term ... (if i;m understanding you correctly)
0
 

Author Closing Comment

by:red_75116
ID: 36709664
thanks
0

Featured Post

Zoho SalesIQ

Hassle-free live chat software re-imagined for business growth. 2 users, always free.

Join & Write a Comment

This article explains how to reset the password of the sa account on a Microsoft SQL Server.  The steps in this article work in SQL 2005, 2008, 2008 R2, 2012, 2014 and 2016.
The Delta outage: 650 cancelled flights, more than 1200 delayed flights, thousands of frustrated customers, tens of millions of dollars in damages – plus untold reputational damage to one of the world’s most trusted airlines. All due to a catastroph…
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.
Viewers will learn how to use the SELECT statement in SQL and will be exposed to the many uses the SELECT statement has.

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

15 Experts available now in Live!

Get 1:1 Help Now