generating a key for a table

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

ASKER

ok, thanks
lets say i have rows with id

1
2
3
4
and so on

if row 3 is deleted then will id 4 become id 3 ?

thx.

>if row 3 is deleted then will id 4 become id 3 ?
Nope. These numbers are generated once the row is initially inserted, and never to be reused, so in your example if 3 is deleted 4 is still 4, and does not 'move' to 3.

>> There can be two or more rows which can have exactly same column values. <<

If all the columns are exactly the same, and if there's no time element (such as datetime) in the data, why not just prevent the duplicate row(s) from INSERTing instead? Why would you need repeated copy(ies) of exactly the same data in a table??

ASKER

>>>Why would you need repeated copy(ies) of exactly the same data in a table??
well that is a good point :-)
i am going to ask my manager, the problem here is we dont generate the data directly, we get the data from another team and process it.

thx.

It still is probably a good idea to add an identity column, but don't let that overshadow making the correct data decision anyway :-).

If you can, just delete the excess duplicate rows, less overhead all around.

>we get the data from another team and process it.
Sounds like you have a need to initially 'stage' the data into separate tables, where validations such as uniqueness, numbers are numbers, dates are dates, no one was born yesterday, products don't cost a brazillion dollars, Cleveland is not in Indiana, etc. are performed, before the data is ultimately inserted into your production environments.

ASKER

yes, agree
Is there a way we can add a trigger (or some db utility) which checks before inserting every row and if a duplicate row already exists it doesnt add the new row.
Can something of that kind be possible?

thanks.

Yes that's possible, and there are lots of threads out there on how to do this, as long as you have all ten or so columns in the WHERE clause.

How many rows are we inserting here? If large, a merge would be a better approach, either in T-SQL or in SSIS. Especially if the business wants some kind of performance metrics on how many rows were inserted - updated - rejected - whatever.

ASKER

>>How many rows are we inserting here?
Between 600000 and 800000

thx

A trigger would be more overhead than an EXISTS check in the INSERT statement, although you will still need the trigger as a "failsafe" mechanism.

You also want to include DISTINCT in the SELECT list on the INSERT to prevent duplicates from within the INSERT itself. That will add some overhead to the INSERT, because the rows will have to be sorted, but that can't be avoided.

ASKER

thanks.
I tried doing
ALTER TABLE YourTABLE
ADD id int identity(1,1) PRIMARY KEY

first time i am inserting 100,000 rows, i see id values from 1 to 100,000

then i do
DELETE from TABLE;

Then when i populate 100,000 rows again i see
id starting from 100001, 100002, ect

Curious why doesnt it start from zero..

thanks.

ASKER

intrestingly when i do
truncate table MYTABLE (instead of DELETE statement)
and
insert the data again the id starts from 0 and not 100001.
But i dont have previlege to do truncate on the prod database.

Any idea how I can make sure that id always start from 0?

Thanks.

ASKER CERTIFIED SOLUTION

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial