Solved

Select distinctly using GROUP BY

Posted on 2004-08-16
5
223 Views
Last Modified: 2008-02-20
I am trying to remove duplicate rows from a query using GROUP BY and max but I dont seem to be having any joy.

I have 3 tables:

tblCourses:
keyCourse
Title

tblSubjects:
keySubject
Subject

tblCourses_Subjects:
keyFCourse
keyFSubject

tblCourses_Subjects is an allocation table allowing it to be a many-many relationship.

I now want to query this by selecting ALL course WHERE a boolean is set as per this query:

CREATE PROCEDURE Courses_GetFeaturedCourses
AS
SELECT DISTINCT max(tblCourses_Subjects.keyFSubject), tblSubjects.keySubject, tblSubjects.Subject, tblCourses.keyCourse, tblCourses.Title, tblCourses.Active FROM tblCourses INNER JOIN tblCourses_Subjects ON tblCourses.keyCourse = tblCourses_Subjects.keyFCourse INNER JOIN tblSubjects ON tblCourses_Subjects.keyFSubject = tblSubjects.keySubject
WHERE tblCourses.Active = 1 AND tblCourses.Featured = 1
GROUP BY tblSubjects.keySubject, tblSubjects.Subject, tblCourses.keyCourse, tblCourses.Title, tblCourses.Active
GO

The problem is that this return duplications for courses which are in multiple subjects. I thought I could get round this by using max() (on keyFSubject) and GROUP BY, but it still appears to be giving me duplications.

Output:
keySubject, keyCourse, Title, Subject

Output:
1, 1, Title, Subject
1, 1, Title, Subject
2, 2, Title, Subject
3, 8, Title, Subject
4, 12, Title, Subject

It is the SECOND row of output I'm trying to get rid of.

Many thanks


0
Comment
Question by:jonnyboy69
  • 2
  • 2
5 Comments
 
LVL 17

Accepted Solution

by:
BillAn1 earned 150 total points
Comment Utility
You have MAX, but you are still including the subject  key without a max, and also the subject name as well
You could do as below, but it does have one problem - if a course is associated with more than 1 subject, it will return the maximum subject key, and the "maximum" subject name - you could have one subject with key 1, subject Zoology, and another record with key 99, subject Anatomy. Of both of htese subjects are part of the "Biology" course, then this query will return you a record with 99 Zoology Biology, since 99 is the max key, and Zoology is the max name, but of course 99 & Zoology don't go together.
Normally, if you want a distinct, it makes more sense to leave out altoghether any columns that have multiple values, particularly if they are text.

SELECT DISTINCT
    max(tblCourses_Subjects.keyFSubject),
    max(tblSubjects.Subject),
    tblCourses.keyCourse,
    tblCourses.Title,
    tblCourses.Active
FROM tblCourses
INNER JOIN
    tblCourses_Subjects
    ON tblCourses.keyCourse = tblCourses_Subjects.keyFCourse
INNER JOIN
    tblSubjects
    ON tblCourses_Subjects.keyFSubject = tblSubjects.keySubject
WHERE tblCourses.Active = 1
    AND tblCourses.Featured = 1
GROUP BY
    tblCourses.keyCourse,
    tblCourses.Title,
    tblCourses.Active
0
 

Author Comment

by:jonnyboy69
Comment Utility
Nope thats the kiddie. I do note your point about max problems, however it doesnt actually matter what the subject values are in this case, as long as they are valid.
Thanks
0
 
LVL 69

Expert Comment

by:ScottPletcher
Comment Utility
I think the query below will do it, and return consistent/matching subject keys, course keys, titles and subjects.


SELECT uniqCourseSubject.keyFSubject AS KeySubject, tc.KeyCourse, tc.Title,
      (SELECT TOP 1 ts.Subject
      FROM tblSubjects ts
      WHERE uniqCourseSubject.keyFSubject = ts.keySubject) AS Subject
FROM tblCourses tc
INNER JOIN (
      SELECT keyFCourse, keyFSubject
      FROM tblCourses_Subjects tcs
      GROUP BY keyFCourse, keyFSubject
) AS uniqCourseSubject ON tc.keyCourse = uniqCourseSubject.keyFCourse
WHERE tc.Active = 1 AND tc.Featured = 1
0
 
LVL 17

Expert Comment

by:BillAn1
Comment Utility
I think you need to modify that a bit, jonnyboy69 only wants one subject per course,

select ts.keyFSubject , tc.KeyCourse, tc.Title, ts.Subject
from tblCourses tc inner join
(
select keyFCourse, max(keyFSubject) keyFSubject
     from  tblCourses_Subjects tcs
     group by keyFCourse
) uq
on tc.keyCourse = uq.keyFCourse
inner join tblSubjects ts
on ts.keySubject = uq.keyFSubject
0
 
LVL 69

Expert Comment

by:ScottPletcher
Comment Utility
D'OH:

Yes, I meant to do that (actually I was planning to use MIN(), but MAX() is good too :-) ).  Sorry, don't have any data to test with so it's to overlook things.
0

Featured Post

Control application downtime with dependency maps

Visualize the interdependencies between application components better with Applications Manager's automated application discovery and dependency mapping feature. Resolve performance issues faster by quickly isolating problematic components.

Join & Write a Comment

Having an SQL database can be a big investment for a small company. Hardware, setup and of course, the price of software all add up to a big bill that some companies may not be able to absorb.  Luckily, there is a free version SQL Express, but does …
Everyone has problem when going to load data into Data warehouse (EDW). They all need to confirm that data quality is good but they don't no how to proceed. Microsoft has provided new task within SSIS 2008 called "Data Profiler Task". It solve th…
Using examples as well as descriptions, and references to Books Online, show the different Recovery Models available in SQL Server and explain, as well as show how full, differential and transaction log backups are performed
Viewers will learn how to use the UPDATE and DELETE statements to change or remove existing data from their tables. Make a table: Update a specific column given a specific row using the UPDATE statement: Remove a set of values using the DELETE s…

772 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

8 Experts available now in Live!

Get 1:1 Help Now