What's better, GROUP BY or DISTINCT?

Hello everybody,

If I want to get distinct values, I can use either DISTINCT or GROUP BY. The usage can vary of course, but for these two queries:

SELECT DISTINCT city FROM clients

and

SELECT city FROM clients GROUP BY city

the result set will be identical. I have heard or read somewhere that DISTINCT should be avoided if possible. I don't insist on this, but since it's sitting somewhere at the back of my head, I'm wondering: is it really bad to use DISTINCT in this case (the table can be rather big), or is GROUP BY better? And if so, what is the advantage of one over the other?

Thank you,
Yurich
 
imran_fast commented:
SQL Server will generate the same plan for both. Use GROUP BY if you are computing aggregates; otherwise stick with DISTINCT, because it is easier to read. Concentrating on the logical construct simplifies problems in general.
In SQL Server 2000 the DISTINCT operator is physically implemented as GROUP BY, so they should both perform the same.
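
If you want to check this on your own server rather than take it on trust, one option is to compare the estimated plans for the two queries. This is only a sketch, assuming the clients table from the question exists in the current database. SET SHOWPLAN_TEXT has to be the only statement in its batch, hence the GO lines, which are a batch separator understood by Query Analyzer and similar client tools, not a T-SQL statement.

```sql
-- Return the estimated plan as text instead of executing the queries.
SET SHOWPLAN_TEXT ON;
GO

-- The two queries from the question; compare the operators in each plan.
SELECT DISTINCT city FROM clients;
GO

SELECT city FROM clients GROUP BY city;
GO

-- Switch back to normal execution.
SET SHOWPLAN_TEXT OFF;
GO
```

If both plans show the same operators, the choice between the two forms comes down to readability for this query.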

<<using DISTINCT should be avoided if possible>>
Yes, it should be avoided where it is not needed, so that the developer stays careful about joins: DISTINCT can hide duplicate rows that come from an incorrect join.
 
Patrick Matthews commented:
Hi Yurich,

It depends on what you need to do. Let's say you need one record per customer, plus the average of that customer's orders in a date range. Use GROUP BY, because you are computing an aggregate value (the average of one or more orders).

If you simply want to know which customers placed any orders at all in the period, DISTINCT is fine, but make sure you limit the columns in the SELECT clause to those that produce a unique row.
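
As a rough illustration of the two cases, here is a sketch against a hypothetical orders table. The table name, the CustomerID column and the date range are assumptions made up for the example; only OrderAmount and OrderDate are column names mentioned in this thread.

```sql
-- Case 1: an aggregate per customer, so GROUP BY is the natural fit.
-- Average order amount per customer for orders placed in January 2006.
SELECT CustomerID, AVG(OrderAmount) AS AvgOrderAmount
FROM orders
WHERE OrderDate >= '20060101' AND OrderDate < '20060201'
GROUP BY CustomerID;

-- Case 2: no aggregate, only "who ordered at all", so DISTINCT reads better.
-- Only CustomerID is selected, so each customer appears once.
SELECT DISTINCT CustomerID
FROM orders
WHERE OrderDate >= '20060101' AND OrderDate < '20060201';
```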

Regards,

Patrick
 
Yurich (author) commented:
Thanks Patrick,

"...make sure you limit your columns in the SELECT clause to those columns that produce a unique row"

That would be possible if I have duplicates in my table, right?

Regards,
Yurich
 
imran_fast commented:
<<That would be possible if I have duplicates in my table, right?>>
Yes.
 
Yurich (author) commented:
Thanks again,

"SQL 2000 DISTINCT operator is physically implemented as GROUP BY" - that's my answer ;)

Appreciate your help,
Yurich
 
Patrick Matthews commented:
Yurich,

> > "...make sure you limit your columns in the SELECT clause to those columns that produce a unique row"

> That would be possible if I have duplicates in my table, right?

That depends. In the example I gave, if you decided to go with SELECT DISTINCT, I would definitely leave out a column like OrderID, OrderAmount or OrderDate: if the customer had more than one order in the period, that column will hold multiple values for the same customer, and the SELECT DISTINCT will return all of those records, because the whole row defines uniqueness.
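
A small self-contained sketch of that effect, using made-up sample data in a temp table (the table, the CustomerID column and all values are hypothetical; OrderID, OrderAmount and OrderDate are the columns named above):

```sql
-- Hypothetical data: customer 1 placed two orders, customer 2 placed one.
CREATE TABLE #orders (
    OrderID     int,
    CustomerID  int,
    OrderAmount money,
    OrderDate   datetime
);
INSERT INTO #orders VALUES (1, 1, 10.00, '20060105');
INSERT INTO #orders VALUES (2, 1, 25.00, '20060112');
INSERT INTO #orders VALUES (3, 2, 40.00, '20060120');

-- Only CustomerID defines the row: customer 1 collapses to one row (2 rows total).
SELECT DISTINCT CustomerID FROM #orders;

-- OrderID differs on every row, so nothing collapses (3 rows total).
SELECT DISTINCT CustomerID, OrderID FROM #orders;

DROP TABLE #orders;
```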

Patrick
Question has a verified solution.
