al4629740
asked on
Select distinct records from Join statement
On the following query I am looking for distinct records. Many of the records are duplicates. How can I select only the unique, distinct records. The whole row needs to be compared to other rows. I am having a brain fart. Would it be by adding group by H.Agency, H.ActivityDate, H.Classification,H.BasedOn , H.HourTimeFrom, H.HourTimeTo, H.Hours, H.Duration, .Frequency, H.ActivityType, H.ActivityOther, H.Objectives, H.Strategy, H.Need, H.NeedOther, H.Outcome, A.ActivityName, H.Narrative, H.EntryTime, H.Fiscal, H.Unregistered
at the end?
at the end?
SELECT H.Agency, H.ActivityDate, H.Classification,H.BasedOn, H.HourTimeFrom,
H.HourTimeTo, H.Hours, H.Duration, H.Frequency, H.ActivityType, H.ActivityOther,
H.Objectives, H.Strategy, H.Need, H.NeedOther,
H.Outcome, A.ActivityName, H.Narrative, H.EntryTime, H.Fiscal, H.Unregistered
FROM tblOrgHours H
INNER JOIN tblOrgActivities A
ON H.ActivityID = A.ActivityID
WHERE H.Unregistered > 0
ASKER
can I use group by?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
And that will get the same results?
What does without aggregation mean?
What does without aggregation mean?
select distinct <data>
is equivalent to
select <data>
group by <data>
it mostly depends on the where clause and joins as to how / allowing SQL optimizer to exclude rows from the comparison.
Essentially, DISTINCT will gather all information, do any cast or converts (etc) and then decide DISTINCT
Group By can (not always) filter out some data before having to decide the result set.
If you look at the logical processing order of execution :
1. FROM
2. ON
3. OUTER
4. WHERE
5. GROUP BY
6. CUBE | ROLLUP
7. HAVING
8. SELECT
9. DISTINCT
10 ORDER BY
11. TOP
You see DISTINCT happening late in the cycle - so - if you cant eliminate unwanted information first, it has a potential truckload of work to decide distinct.
In your example, there is not much difference and there are no aggregations - so DISTINCT is probably fine
if ever in doubt, eliminate the unwanted stuff first with a CTE
with cte as
(select <data> where conditions and manipulations occur and exclusions )
select distinct <refined data>
from CTE
aggregation means you want to use SUM() or COUNT() an element not contained in your group by, and distinct wont allow aggregations per se - that is when you need to use grouping (and grouping can include window functions using the OVER() clause)
but you have already selected your answer :) I just wish I remembered to hit submit when I started answering....
is equivalent to
select <data>
group by <data>
it mostly depends on the where clause and joins as to how / allowing SQL optimizer to exclude rows from the comparison.
Essentially, DISTINCT will gather all information, do any cast or converts (etc) and then decide DISTINCT
Group By can (not always) filter out some data before having to decide the result set.
If you look at the logical processing order of execution :
1. FROM
2. ON
3. OUTER
4. WHERE
5. GROUP BY
6. CUBE | ROLLUP
7. HAVING
8. SELECT
9. DISTINCT
10 ORDER BY
11. TOP
You see DISTINCT happening late in the cycle - so - if you cant eliminate unwanted information first, it has a potential truckload of work to decide distinct.
In your example, there is not much difference and there are no aggregations - so DISTINCT is probably fine
if ever in doubt, eliminate the unwanted stuff first with a CTE
with cte as
(select <data> where conditions and manipulations occur and exclusions )
select distinct <refined data>
from CTE
aggregation means you want to use SUM() or COUNT() an element not contained in your group by, and distinct wont allow aggregations per se - that is when you need to use grouping (and grouping can include window functions using the OVER() clause)
but you have already selected your answer :) I just wish I remembered to hit submit when I started answering....
>> What does without aggregation mean?
As Mark mentioned above, Aggregation operations refers to Min(), MAX(), SUM(), COUNT(), etc., Since your query doesn't have any of the aggregations, you can use either DISTINCT or GROUP BY. If you have aggregations, then you might need to use GROUP BY only..
As Mark mentioned above, Aggregation operations refers to Min(), MAX(), SUM(), COUNT(), etc., Since your query doesn't have any of the aggregations, you can use either DISTINCT or GROUP BY. If you have aggregations, then you might need to use GROUP BY only..
Open in new window