Link to home
Start Free TrialLog in
Avatar of caylt
cayltFlag for United States of America

asked on

Weeding out duplicate values (using DISTINCT)

Environment: MS SQL Server 2000

Say I have a table like the following:

identity,userid,colour,flavour
1,1,blue,vanilla
2,1,blue,strawberry
3,2,red,chocolate
4,2,red,chocolate
5,3,black,licorice

'SELECT DISTINCT userid, colour, flavour' blocks the display of row 4 (since every element in that row was already returned with row 3), but not row 2 (since the flavour value differs).  How do I return only one row per userid value?

I haven't found any working examples of DISTINCT ORDER BY (which sounds like it would be useful).

Thanks in advance,

C
ASKER CERTIFIED SOLUTION
Avatar of rafrancisco
rafrancisco

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Brian Crowe
It sounds like is that you want a single entry per userid/color/flavor combo

SELECT Min(identity), userid, colour, flavour
FROM myTable
GROUP BY userid, colour, flavour
oops like i misread...

SELECT *
FROM myTable
WHERE myTable.identity IN (SELECT TOP 1 identity FROM myTable B
     WHERE B.userid = myTable.userid
     ORDER BY {insert your order here to determine which record is listed first})
for performance reasons i would use the following
SELECT identity, userid, colour, flavour
FROM  
(SELECT Min(identity) as identity, userid, colour, flavour
FROM myTable
GROUP BY userid, colour, flavour) as innerTable
order By 1
The GROUP BY clause can be sped up if you follow these suggestion:

Keep the number of rows returned by the query as small as possible.

 

Keep the number of groupings as few as possible.

 
reason for my previous post
 snippet from  http://www.sql-server-performance.com/transact_sql.asp


Don't group redundant columns.


If there is a JOIN in the same SELECT statement that has a GROUP BY, try to rewrite the query to use a subquery instead of using a JOIN. If this is possible, performance will be faster. If you have to use a JOIN, try to make the GROUP BY column from the same table as the column or columns on which the set function is used.