Link to home
Start Free TrialLog in
Avatar of pgreetis
pgreetis

asked on

SQL SELECT statement that ignores duplicate records

Okay so I have search everywhere and either don't get it or am not using the correct terminology.  I am trying to write a SELECT statement to select all of the records in a database, except for the duplicates.  The table have columns such as firstname, lastname, SSN, order number.  So the table with data would look like this:

firstname  lastname  SSN                   order number
-------------------------------------------------------------
bob           smith       111-22-3333        125675
jerry         jones       333-22-4444         789439
bob           smith       111-22-3333         848493
kari           taylor       123-45-6789         435435
bob           smith       111-22-3333         256999
kari           taylor       123-45-6789         443435

I want to compare the records off of the SSN column and just get a recordset like the following:

firstname  lastname  SSN        
----------------------------------------
bob           smith       111-22-3333
jerry         jones       333-22-4444
kari           taylor       123-45-6789


Can someone please help me with this?  I have tried several different ways according to other websites and I either get all the records returned or only one.  Thanks much!!
ASKER CERTIFIED SOLUTION
Avatar of Aneesh
Aneesh
Flag of Canada image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Method 2

SELECT T1.firstName, T1.lastname,T1.SSN,T1.OrderNumber
FROM urTable T1
INNER JOIN
(SELECT DISTINCT SSN FROM urTable) T2
ON T1.SSN = T2.SSN
Avatar of pgreetis
pgreetis

ASKER

Thanks Aneesh R!, but when I tried that it just gives me the total number of records.  Any other ideas?
ok ,i got it, U ned to replace the ssn in the inner query using OrderNumber, hoping that it is unique
So tried this:

SELECT firstName, lastname,SSN,OrderNumber
FROM urTable
WHERE SSN in
(SELECT DISTINCT OrderNUmber FROM urTable)

AND

SELECT firstName, lastname,SSN,OrderNumber
FROM urTable
WHERE OrderNUmber in
(SELECT DISTINCT OrderNumber FROM urTable)


And got 0 records returned.  I feel like such an idiot because I know this should be so easy...


SELECT T1.firstName, T1.lastname,T1.SSN,T1.OrderNumber
FROM urTable T1
INNER JOIN
(SELECT MAX(OrderNumber)  OrderNumber FROM urTable GROUP BY SSN) T2
ON T1.OrderNumber = T2.OrderNumber
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Okay one problem.  The excel file I am running this SQL SELECT statement against has a column that has a space in the name (ie, Customer SNN), so if I set the statement like so:

SELECT T1.firstName, T1.lastname,T1.'Customer SSN',T1.OrderNumber
FROM urTable T1
INNER JOIN
(SELECT MAX(OrderNumber)  OrderNumber FROM urTable GROUP BY 'Customer SSN') T2
ON T1.OrderNumber = T2.OrderNumber

OR

SELECT T1.'firstName', T1.'lastname',T1.'Customer SSN',T1.'OrderNumber'
FROM urTable T1
INNER JOIN
(SELECT MAX('OrderNumber')  'OrderNumber' FROM urTable GROUP BY 'Customer SSN') T2
ON T1.'OrderNumber' = T2.'OrderNumber'

I get an error message.  This is in VB6 by the way.

ZeonFlash,

Of course this will work, but if you wanna select the records based on the Ordernumber, i think it wont work
To HuyBD and ZeonFlash,

When I try those ways I only get one record returned.  Thanks those.
Put a square bracket instead of Quotes

SELECT T1.firstName, T1.lastname,T1.'Customer SSN',T1.OrderNumber
FROM urTable T1
INNER JOIN
(SELECT MAX(OrderNumber)  OrderNumber FROM urTable GROUP BY [Customer SSN]) T2
ON T1.OrderNumber = T2.OrderNumber
pgreetis,
> When I try those ways I only get one record returned.  Thanks those.

HuyBD's solution will work

SELECT firstName, lastName, SSN,MAX(OrderNumber)OrderNumber
FROM urTable
GROUP BY firstName, lastName, SSN
Both ways are compaining about a Syntax Error (missing operator) in query expression

For the one from HuyBD it says 'MAX(OrderNumber)OrderNumber'

The one you showed above said the same thing about this part: MAX(OrderNumber)  OrderNumber
declare @tab table (firstName varchar(32),lastname varchar(33), SSN varchar(32), OrderNumber int)

Insert into @tab
select 'bob',           'smith',       '111-22-3333',        125675

Insert into @tab
select 'jerry' ,        'jones'   ,    '333-22-4444'  ,       789439
Insert into @tab
select 'bob',           'smith',       '111-22-3333',         848493

SELECT firstName, lastName, SSN,MAX(OrderNumber)OrderNumber
FROM @tab
GROUP BY firstName, lastName, SSN
The only problem with that solution is that the excel file will never be the same other than the column names and data types.  I will not be able to run the above Declare and Insert every time I want to work with the file.
i just given an example, to show that it is working
Gotcha!  Unfortunately that still doesn't solve my problem.  If I don't get a working solution together by the end of the day tomorrow I will award the points to Aneesh R
Okay so I don't know what happened between last night and this morning but I tried this:

SELECT DISTINCT([Customer SSN]) FROM urTable

just on a whim and it is magically working!!  I thought I had tried this but I guess not.  I am going to split the points as everyone did technically provide working solutions, but I am giving a larger share to Aneesh R as he put a lot into this.  Thanks again everyone!!