Solved

What's better, GROUP BY or DISTINCT?

Posted on 2006-04-08
Last Modified: 2011-10-03
Hello everybody,

If I want to get distinct values, I can use either DISTINCT or GROUP BY. The usage can vary, of course, but take these two queries:

SELECT DISTINCT city FROM clients

and

SELECT city FROM clients GROUP BY city

The result set will be identical. I have heard or read somewhere that DISTINCT should be avoided if possible. I don't insist on this, but since it's sitting somewhere at the back of my head, I'm wondering: is it really bad to use DISTINCT in this case (the table can be rather big), or is GROUP BY better? And if so, what is the advantage of one over the other?

Thank you,
Yurich
Question by:Yurich
6 Comments
 
LVL 93

Assisted Solution

by:Patrick Matthews
Patrick Matthews earned 1000 total points
ID: 16406930
Hi Yurich,

Depends on what you need to do.  Let's say you need to get one record per customer, and an average of orders
by that customer in a date range.  Use GROUP BY, because you are computing an aggregate value (the average
of 1+ orders).

If you simply wanted to know which customers made any orders at all in the period, DISTINCT would be fine,
but make sure you limit your columns in the SELECT clause to those columns that produce a unique row.

Regards,

Patrick
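
To make the distinction above concrete, here is a minimal sketch. It runs the SQL through Python's built-in sqlite3 module purely as a convenient stand-in for SQL Server; the orders table, its columns, and the sample rows are all invented for illustration and are not from the original question.

```python
import sqlite3

# Hypothetical schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        order_date  TEXT    NOT NULL,   -- ISO dates compare correctly as text
        amount      REAL    NOT NULL
    );
    INSERT INTO orders VALUES
        (1, 10, '2006-01-05', 100.0),
        (2, 10, '2006-02-10', 300.0),
        (3, 20, '2006-02-15',  50.0),
        (4, 30, '2005-12-31', 999.0);   -- outside the date range below
""")

date_range = ("2006-01-01", "2006-03-31")

# One row per customer plus an aggregate value: this needs GROUP BY.
avg_per_customer = conn.execute(
    """
    SELECT customer_id, AVG(amount)
    FROM orders
    WHERE order_date BETWEEN ? AND ?
    GROUP BY customer_id
    ORDER BY customer_id
    """,
    date_range,
).fetchall()

# Only "who ordered at all?": DISTINCT on the one identifying column.
customers_with_orders = conn.execute(
    """
    SELECT DISTINCT customer_id
    FROM orders
    WHERE order_date BETWEEN ? AND ?
    ORDER BY customer_id
    """,
    date_range,
).fetchall()

print(avg_per_customer)
print(customers_with_orders)
```

Both queries return one row per customer in the period; only the first one carries a computed value, which is why it has to be written with GROUP BY.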
 
LVL 21

Author Comment

by:Yurich
ID: 16406963
Thanks Patrick,

"...make sure you limit your columns in the SELECT clause to those columns that produce a unique row"

That would be possible if I have duplicates in my table, right?

Regs,
Yurich
 
LVL 28

Accepted Solution

by:imran_fast
imran_fast earned 1000 total points
ID: 16406965
SQL Server will generate the same plan for both. Use
GROUP BY if you are computing aggregates; otherwise stick with DISTINCT
because it is easier to read. Concentrating on the logical construct
simplifies problems in general.
In SQL Server 2000 the DISTINCT operator is physically implemented as
GROUP BY, so they should both perform the same.

<<using DISTINCT should be avoided if possible>>
Yes, it should be avoided as a habit: DISTINCT can hide duplicate rows
produced by a careless join, so doing without it forces the developer
to be careful about joins.
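
A small sketch of both points, under stated assumptions: the SQL is run through Python's sqlite3 module only because it is easy to execute anywhere, so it can show result sets but says nothing about SQL Server's query plans. The clients and orders tables and their rows are hypothetical.

```python
import sqlite3

# Hypothetical tables and rows; SQLite is only a convenient stand-in here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (client_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders  (order_id  INTEGER PRIMARY KEY, client_id INTEGER);
    INSERT INTO clients VALUES
        (1, 'Ann', 'Auckland'), (2, 'Bob', 'Auckland'), (3, 'Cy', 'Wellington');
    INSERT INTO orders VALUES (100, 1), (101, 1), (102, 3);
""")


def rows(sql):
    """Run a query and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# The two queries from the question return the same set of cities.
with_distinct = rows("SELECT DISTINCT city FROM clients ORDER BY city")
with_group_by = rows("SELECT city FROM clients GROUP BY city ORDER BY city")

# A join to orders repeats a client once per order (Ann appears twice).
joined = rows("""
    SELECT c.name FROM clients c
    JOIN orders o ON o.client_id = c.client_id
    ORDER BY c.name
""")

# DISTINCT hides the repetition, but the join still produced the extra rows.
joined_distinct = rows("""
    SELECT DISTINCT c.name FROM clients c
    JOIN orders o ON o.client_id = c.client_id
    ORDER BY c.name
""")

# Stating the intent ("clients that have an order") avoids the duplicates.
with_exists = rows("""
    SELECT c.name FROM clients c
    WHERE EXISTS (SELECT 1 FROM orders o WHERE o.client_id = c.client_id)
    ORDER BY c.name
""")

print(with_distinct == with_group_by)
print(joined, joined_distinct, with_exists)
```

To check the "same plan" claim on SQL Server itself, one option is to compare the estimated plans of the two queries on your own version, for example with SET SHOWPLAN_TEXT ON.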
 
LVL 28

Expert Comment

by:imran_fast
ID: 16406977
<<That would be possible if I have duplicates in my table, right?>>
yes.
 
LVL 21

Author Comment

by:Yurich
ID: 16407006
Thanks again,

"SQL 2000 DISTINCT operator is physically implemented as GROUP BY" - that's my answer ;)

Appreciate your help,
Yurich
 
LVL 93

Expert Comment

by:Patrick Matthews
ID: 16411874
Yurich,

> > "...make sure you limit your columns in the SELECT clause to those columns that produce a unique row"

> That would be possible if I have duplicates in my table, right?

That depends.  In the example I gave, if you decided to go with SELECT DISTINCT, I would definitely leave
out a column like OrderID or OrderAmount or OrderDate: if the customer had more than one order in
the period, there will be multiple values in that column for the same customer, and the SELECT DISTINCT
would return all the records, because the whole record defines the uniqueness.

Patrick
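
The effect described above can be seen in a short sketch. As before, sqlite3 is used only as an easy way to run the SQL, and the orders table with its three rows is an invented example, not data from the thread.

```python
import sqlite3

# Hypothetical data: customer 10 has two orders, customer 20 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER,
        amount      REAL
    );
    INSERT INTO orders VALUES (1, 10, 100.0), (2, 10, 300.0), (3, 20, 50.0);
""")

# DISTINCT over the customer column alone: one row per customer.
narrow = conn.execute(
    "SELECT DISTINCT customer_id FROM orders ORDER BY customer_id"
).fetchall()

# Adding order_id makes every row unique, so DISTINCT removes nothing.
wide = conn.execute(
    "SELECT DISTINCT customer_id, order_id FROM orders "
    "ORDER BY customer_id, order_id"
).fetchall()

print(narrow)
print(wide)
```

Because uniqueness is judged on the whole selected row, the second query returns every order even though it still says DISTINCT.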