Link to home
Start Free TrialLog in
Avatar of vacalo69
vacalo69

asked on

SQL DISTINCT beginner question...Probably difficult!

Hi there,

I have a table "Reserved" with the following two columns and five records:

Rec_ID    CID
-------------
 80        14
 80        6
 80        15
 79        6
 78        6

I need a query which is going to do the following:

"Return all rows besides the duplicates that do not have a CID of 14".
I know it might sound weird. Let me explain. There are 3 duplicate records with Rec_ID = 80 and out of these we need the ONE with CID = 14. That is one record "group" that we need.
Also there are two non-duplicate values(REC_ID = 79,78), which are another "group". All we need is the UNION of these two "groups", which will give us:

Rec_ID    CID
-------------
 80        14
 79        6
 78        6

People I would be extremely obliged if you can help me on this by providing me a SQL query.

Thanks a lot in advance

Chris
Avatar of cdillon
cdillon

You could use this:

select rec_id,
(select top 1 cid from reserved where r.rec_id = rec_id and cid <> 14) as mycid
from reserved r
where r.cid <> 14
group by r.rec_id
Avatar of vacalo69

ASKER

This returns:

Rec_ID    CID
-------------
80        6
79        6
78        6

and not

Rec_ID    CID
-------------
80        14
79        6
78        6

Thanks though. Can you fix it?
Thanks
select recID,14
from reserved t1
where 14 in (select cid from reserved t2 where t1.recID = t2.recID)
group by recID
union
select recID,CID
from reserved
where recID not in
(select recID
from reservedt1
where 14 in (select cid from reservedt2 where t1.recID = t2.recID)
group by recID)


yields the following results:

recID   CID
-----------
78     6
79     6
80     14
You could use this:

select rec_id,
(select top 1 cid from reserved where r.rec_id = rec_id and cid <> 14) as mycid
from reserved r
where r.cid <> 14
group by r.rec_id
Sorry, I refreshed the page:  Try

select rec_id,
(select top 1 cid from reserved where r.rec_id = rec_id ) as mycid
from reserved r
group by r.rec_id
Try this:

select rec_id, cid, NULL
from reserved
where cid = 14
union
select rec_id, max(cid), sum(1)
from reserved
group by rec_id
having sum(1) = 1

This should give you the rowset you want, just ignore the third column.
Are these real values? As in, are you looking for something that works with literally this data, or for something that would work in parallel cases when you're not sure of one piece or another?

If this is your real data, you could use
SELECT rec_id, cid
FROM reserved
WHERE cid = 14 OR rec_id <> 80

If this is just sample data, and you would formulate the query more as "Give me the highest cid from any single or group of identical rec_ids," then you'll want something more like cdillon's second example. Union can be pretty intensive when you get much data involved.


Avatar of Scott Pletcher
The query below should return any row with a unique Rec_ID AND any row(s) that have duplicate Rec_IDs and a CID = 14; it will NOT return a row for duplicates that DON'T have a CID = 14.


SELECT Rec_ID, CID
FROM Reserved R1
WHERE NOT EXISTS(
    SELECT 1
    FROM Reserved R2
    WHERE R1.Rec_ID = R2.Rec_ID AND R1.CID <> R2.CID)
OR (EXISTS(
    SELECT 1
    FROM Reserved R2
    WHERE R1.Rec_ID = R2.Rec_ID AND R1.CID <> R2.Rec_ID)
    AND CID = 14)
ASKER CERTIFIED SOLUTION
Avatar of malekam
malekam

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Left out something:

--Distinct Group where CID = 14
select distinct rec_id, cid, NULL
from reserved
where cid = 14
UNION
--Non-duplicates where CID <> 14
select rec_id, max(cid), sum(1)
from reserved
where cid <> 14
group by rec_id
having sum(1) = 1

Unions kinda have a bad rap but they're not so bad, especially in SQL Server.  Many times, a UNION will significantly out-perform an OR predicate, even on very large tables.
Do you have actual, specific results of a UNION, which it seems to forces two table scans rather than one, performing better?  I just can't see how that could be.
I guess I should have been more specific.  The point would be to avoid doing table scans to begin with unless the tables involved are very small. Take advantage of indexes where you can and within reason.  Indexes have tradeoffs too.

I have run across cases when examining the execution plan where a doing one clustered index scan was slower than doing 2 clustered index seeks (which are much faster) and a UNION.  I guess my point is, examine the execution plan and us it to determine your best course of action.
This returns:

Rec_ID    CID
-------------
80        6
79        6
78        6

and not

Rec_ID    CID
-------------
80        14
79        6
78        6

Thanks though. Can you fix it?
Thanks
sorry I refreshed...I will try testing now