Look for a SQL Term

Posted on 2011-09-26
Medium Priority
Last Modified: 2012-08-14
I have several databases for the same program which I am working to use an ETL program to move data over into a brand new database.

Several of the entities, Items, Projects, Employees, have duplicate IDs or Keys and we are going to have to Recode, change, update etc so those so there are no duplicates in the final database.  Many to One.

My questoin is simply, what is this process called?  A co-worker is calling this normalization, but as I read about that term, that doesn't seem like the right term.  Does anyone know what this procedure is called??

Thanks so much!
Question by:red_75116

Expert Comment

ID: 36707642
SQL Join to Table
LVL 39

Expert Comment

by:Pratima Pharande
ID: 36707654
this is called normalization , there are 3 Normal forms

For Example = 1NF

Eliminate duplicative columns from the same table.
Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).

see for more details

Expert Comment

ID: 36707767
If you are going to remove the duplicates by changing the data itself and not altering tables. you can call this "Data Cleaning" if you are altering tables then its Normalisation
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!


Expert Comment

ID: 36708227
Please assess if you need a normalized or denormalized database. Ideally, a denormalized database is much faster to query but could take up a lot of space and difficult to update.

Expert Comment

ID: 36708464
This is some thing called as data integration.

Author Comment

ID: 36709301
Let me try to clarify my question

If I have two databases with tables called customers and both tables have key fields like ID where they simply increment like

ID    |  Name
001 |  ABC Company
002 |  ACME Brick

and in the other I have
ID    |  Name
001 |  Checkers
002 |  Home Depot

And I am doing an ETL project and the customers from both of these databases are going to get copied and maintained in another database.  I  cant have all these records in the new database cause the Key field won't be unique so we have to  (clean, normalize, align, whatever the word is) and rekey some the data to a prevent any duplicates.  One simple solution would be to do this.

ID      |  Name
A001 |  ABC Company
A002 |  ACME Brick

ID       |  Name
B001 |  Checkers
B002 |  Home Depot

I am asking what the task to rekey this data is called.  (clean, normalize, align, whatever the word is)


LVL 50

Assisted Solution

Lowfatspread earned 2000 total points
ID: 36709307
i believe you are looking for the process of applying surrogate keys to your warehouse/mart ...

this is the process where you supply a new set of keys to the data so that within the database you have independence from the source systems and can support multiple source systems , some using similar (external) keys... which will be resolved by the ETL processing into the relevant new internal key for the warehouse...

it is up to your business logic in the etl stage to determine if you can aggregate the different source objects into existing objects already processed from other systems into the database.

you are conforming/standardising the data values to be held

normalisation is the process of designing the structures to hold/represent the data/relationships
LVL 50

Expert Comment

ID: 36709319
its often called/performed as part of the DATA CLEANSE phase of the ETL processing
LVL 50

Expert Comment

ID: 36709328

Author Comment

ID: 36709334

You answer sounds the closest so far, but there are existing application databases so I dont have the ability to add new keys, but am limited to having to change them.  Kinda ugly I know.

This isnt a warehouse or mart, but the concept is reallly the same.  The databases are different divisions from an ERP program and then getting merged into a web based app every 2-3 hours.

Normalization doesn't seem like thr right word, but these might be.
LVL 50

Accepted Solution

Lowfatspread earned 2000 total points
ID: 36709536
not ugly ,standard procedure to ensure data independence in a "warehouse".

you don't change the source systems , its the new system which maintains the links between old and new ...

whether thats in the staging area or the main database is up to you.  you probably do maintain the existing source keys as a source reference (source system/source internal reference) on the data.

The design of the web based database , ideally shouldn't be just a copy of one of the source systems but rather a specific design for its requirements ... it presumably will have its own lifecycle independant of the source systems..

you haven't indicated if its supposed to be historical (a principle driver in the requirement to implement surrogate keys) or some sort of snapshot of the data (surrogate keys are required to ensure data integrity across the source systems)

i am assuming that you are mapping (conforming) the actual attributes of each source system into a set of "common"  codes for the new web app?

Normalisation definitly isn't the correct term ... (if i;m understanding you correctly)

Author Closing Comment

ID: 36709664

Featured Post


Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

One of the most important things in an application is the query performance. This article intends to give you good tips to improve the performance of your queries.
MSSQL DB-maintenance also needs implementation of multiple activities. However, unprecedented errors can hamper the database management. In that case, deploying Stellar SQL Database Toolkit ensures fast and accurate database and backup repair as wel…
Familiarize people with the process of utilizing SQL Server functions from within Microsoft Access. Microsoft Access is a very powerful client/server development tool. One of the SQL Server objects that you can interact with from within Microsoft Ac…
Via a live example, show how to extract information from SQL Server on Database, Connection and Server properties
Suggested Courses

850 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question