Link to home
Start Free TrialLog in
Avatar of schmir1
schmir1Flag for United States of America

asked on

Query to find duplicate key values between two tables

I'm copying data from my primary table to my archive table as part of an automated process.  This was working but now the archive tables has got a bunch of key field values that duplicate what is in my main table.  I need a query to compare these tables and find key values that are duplicated between tables.  Then to change the archive table to an unused number.  I can manually manually fix the problem by adding a bunch of 1's to the number.  For example:
  Changing  Archive_Issued_Traveler_Steps.Issued_Traveler_Steps_ID from 776182 to 1111776182
I'm looking for a query that could do that.
This is a simple query of the archive table and the field that I'm interested in:
SELECT Archive_Issued_Traveler_Steps.Issued_Traveler_Steps_ID
FROM Archive_Issued_Traveler_Steps;

This is the main table and the field to compare to the archive table
SELECT Issued_Traveler_Steps.Issued_Traveler_Steps_ID
FROM Issued_Traveler_Steps;

Open in new window

Avatar of sameer2010
sameer2010
Flag of India image

Use this code..and substitute the column names as applicable
select case 
when t1.Issued_Traveler_Steps_ID is null then t2.Issued_Traveler_Steps_ID
else
111*power(10,len(t2.Issued_Traveler_Steps_ID))+t2.Issued_Traveler_Steps_ID
end as Issued_Traveler_Steps_ID1,
t2.column2, t2.column3
from Issued_Traveler_Steps as t2
left outer join Archive_Issued_Traveler_Steps as t1
on t2.Issued_Traveler_Steps_ID = t1.Issued_Traveler_Steps_ID 

Open in new window

I have assumed 3 1s to be added (111) and the type of ID column as integer. If it is string, you can replace the logic to append 1s with concatenation of 111 string.
Avatar of js-profi
js-profi

Better change primary key of archive table to a sequence number (automatically). The old primary key should be turned to a normal (indexed) data type.
Avatar of Sharath S
>>  I need a query to compare these tables and find key values that are duplicated between tables.

If I understand correctly, you will be having duplicate values in archive table only. i.e. for a value 1234, there is one record in your main table and duplicate records in your archive table. How exactly you want your output? To retrieve duplicate key values from archive table, you can try like this.

SELECT Issued_Traveler_Steps_ID
FROM Archive_Issued_Traveler_Steps
GROUP BY Issued_Traveler_Steps_ID
HAVING COUNT(Issued_Traveler_Steps_ID) > 1;

Once you get the duplicate Issued_Traveler_Steps_ID values, you can do whatever you want.

Let us know what exactly you are looking for.
Avatar of schmir1

ASKER

sameer2010:
  Your query looks interesting.  I didn't know there were Case statements in queries.  Anyhow, I don't understand it very well.  Can you help me.  

Is t2 the table to compare to t1 with t2 being the table to change?   Also, I only want to change column 1?  I want to change Archive_Traveler_Steps table.  Does the query below look right?  Also, how would I modify it so that it would show the dups without changes anything (for testing).

Thanks
select case  
when Issued_Traveler_Steps.Issued_Traveler_Steps_ID is null then Archive_Issued_Traveler_Steps.Issued_Traveler_Steps_ID 
else 
111*power(10,len(Archive_Issued_Traveler_Steps.Issued_Traveler_Steps_ID))+Archive_Issued_Traveler_Steps.Issued_Traveler_Steps_ID 
end as Issued_Traveler_Steps_ID1
from Issued_Traveler_Steps as Archive_Issued_Traveler_Steps 
left outer join Archive_Issued_Traveler_Steps as Issued_Traveler_Steps 
on Archive_Issued_Traveler_Steps.Issued_Traveler_Steps_ID = Issued_Traveler_Steps.Issued_Traveler_Steps_ID 

Open in new window

Hi,

You need not change anything in the query that I wrote above. I assumed Issued_Traveler_Steps as T2 and Archive_Issued_Traveler_Steps as T1. T2 being the table from where the data needs to be inserted.
This is just a SELECT query, so it would show you the data without any modifications. If you want to actually modify the table, then you would have to user INSERT INTO TABLE_NAME(above query).
Avatar of schmir1

ASKER

I got the following error when I ran your query:

Msg 208, Level 16, State 1, Line 1
Invalid object name 'Issued_Traveler_Steps'.
select case  
when t1.Issued_Traveler_Steps_ID is null then t2.Issued_Traveler_Steps_ID 
else 
111*power(10,len(t2.Issued_Traveler_Steps_ID))+t2.Issued_Traveler_Steps_ID 
end as Issued_Traveler_Steps_ID1, 
t2.Step_Num, t2.Build_Step 
from Issued_Traveler_Steps as t2 
left outer join Archive_Issued_Traveler_Steps as t1 
on t2.Issued_Traveler_Steps_ID = t1.Issued_Traveler_Steps_ID 

Open in new window

Is Issued_Traveler_Steps the correct table name?
Avatar of schmir1

ASKER

Here are the two names copied from the server.  I am using Management Studio to run the query.  The tables have "dbo." in front of them but I don't think that has to be added.

Issued_Traveler_Steps
Archive_Issued_Traveler_Steps
Are you running the query in correct database?

Try like this.
use YourDataBaseName
select case  
when t1.Issued_Traveler_Steps_ID is null then t2.Issued_Traveler_Steps_ID 
else 
111*power(10,len(t2.Issued_Traveler_Steps_ID))+t2.Issued_Traveler_Steps_ID 
end as Issued_Traveler_Steps_ID1, 
t2.Step_Num, t2.Build_Step 
from Issued_Traveler_Steps as t2 
left outer join Archive_Issued_Traveler_Steps as t1 
on t2.Issued_Traveler_Steps_ID = t1.Issued_Traveler_Steps_ID

Open in new window

Avatar of schmir1

ASKER

My bad.  I'm not that familiar with SQL Server stuff.  

I got the output now but it looks like it is listing all the records with Issued_Traveler_Steps_ID that occur in Issued_Traveler_Steps table but do not occur in Archive_Issued_Traveler_Steps.?  I need it to list all the records from Archive_Issued_Traveler_Steps that have Issued_Traveler_Steps_ID values that are identicial in both Issued_Traveler_Steps and Archive_Issued_Traveler_Steps
are you looking for this?
use YourDataBaseName
select case  
when t1.Issued_Traveler_Steps_ID is null then t2.Issued_Traveler_Steps_ID 
else 
111*power(10,len(t2.Issued_Traveler_Steps_ID))+t2.Issued_Traveler_Steps_ID 
end as Issued_Traveler_Steps_ID1, 
t2.Step_Num, t2.Build_Step 
from Issued_Traveler_Steps as t2 
inner join Archive_Issued_Traveler_Steps as t1 
on t2.Issued_Traveler_Steps_ID = t1.Issued_Traveler_Steps_ID

Open in new window

Avatar of schmir1

ASKER

I still wasn't getting the right values so I changed the query to the following.  It correctly labels the records that "Match" and that "No Match".  So it's getting closer.
select 
case  
  when t1.Issued_Traveler_Steps_ID = t2.Issued_Traveler_Steps_ID
  then 'match' 
  else 'no match'
end as Issued_Traveler_Steps_ID1, 
t2.Issued_Traveler_Steps_ID, t2.Issued_Traveler_ID, t2.Step_Num, t2.Build_Step from Archive_Issued_Traveler_Steps as t2 
left outer join Issued_Traveler_Steps as t1 
on t2.Issued_Traveler_Steps_ID = t1.Issued_Traveler_Steps_ID 

Open in new window

>> I still wasn't getting the right values so I changed the query to the following.  It correctly labels the records that "Match" and that "No Match".  So it's getting closer.

Do you still looking for assistance?
Avatar of schmir1

ASKER

Yes.  I would like to be able to list only those that match.  Then when that looks right I would like to change the values.
Avatar of schmir1

ASKER

Also, when I had this line in I was getting overflow errors:

111*power(10,len(t2.Issued_Traveler_Steps_ID))+t2.Issued_Traveler_Steps_ID  
What is the definition of Issued_Traveler_Steps_ID?
Remove multiple occurrences of 111 to 11 and see if that helps. But if this is happening, it means your idea of prefixing all the IDs with 111 will not work because IDs are already such in length that more number could possibly not be prefixed.
Avatar of schmir1

ASKER

It's a 4 byte integer.

If we could get the first part working (listing only the matching records) for now that would be great.
This should work. It would return all the matching IDs only.
select 
t2.Issued_Traveler_Steps_ID, t2.Issued_Traveler_ID, t2.Step_Num, t2.Build_Step from Archive_Issued_Traveler_Steps as t2 
inner join Issued_Traveler_Steps as t1 
on t2.Issued_Traveler_Steps_ID = t1.Issued_Traveler_Steps_ID 

Open in new window

Avatar of schmir1

ASKER

That's a winner.  Now how would I update them?  The highest number is 776168 so adding 111 should be OK?
ASKER CERTIFIED SOLUTION
Avatar of Sharath S
Sharath S
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
To alter the column, try like this.

ALTER  Archive_Issued_Traveler_Steps ALTER  Issued_Traveler_Steps_ID BIGINT

http://technet.microsoft.com/en-us/library/ms190273.aspx
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of schmir1

ASKER

I used Sharath's query and it worked fine.  Thanks to both of you for your help
Avatar of schmir1

ASKER

 Excellent answers.