• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 485
  • Last Modified:

Subquery returning multiple rows

Table CRS has two records per every record in table D, I want the field CRS.S from table CRS with sequence = 2.  Table D and CRS are joined through D to A; A to AC, AC to CRS.

select id, id2, cs, bks = (SELECT    crs.s
                  from d, A, AC, CRS
                  where D.id2 = A.id2
                             and A.AId = AC.AId
                             and AC.CRtId = CRS.CRId
                       and CRS.sequence = 2)
from d

The subquery is returning multiple records, returning all the records in CRS with a sequence 2.  How can I make it return just one record.  I understand that it is return multiple records because I have not stated to return a specific record in the subquery based on decision, my questions is how.
0
needplease
Asked:
needplease
  • 6
  • 5
  • 4
  • +3
2 Solutions
 
TempDBACommented:
Use distinct with select statement.
0
 
jogosCommented:
You have table d in select and in the subselect. So the filtering on D.id2 = A.id2 is done with d from the subselect -> all rows because there is no link with the D from outside.

Just leave out the reference to table D in the subselect

select id, id2, cs, bks = (SELECT    crs.s
                  from  A, AC, CRS
                  where D.id2 = A.id2
                             and A.AId = AC.AId 
                             and AC.CRtId = CRS.CRId
                       and CRS.sequence = 2)
from d

Open in new window

0
 
needpleaseAuthor Commented:
distinct will not work, you will only end up getting 1 record of each.
0
Efficient way to get backups off site to Azure

This user guide provides instructions on how to deploy and configure both a StoneFly Scale Out NAS Enterprise Cloud Drive virtual machine and Veeam Cloud Connect in the Microsoft Azure Cloud.

 
needpleaseAuthor Commented:
Changed it to the following:

select id, id2, cs, bks = (SELECT    crs.s
                  from  A, AC, CRS
                  where A.AId = AC.AId
                             and AC.CRtId = CRS.CRId
                       and CRS.sequence = 2)
from d

And received: Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
0
 
DcpKingCommented:
Your last subquery translates to

select crs.s
from crs
inner join AC   on AC.CRtId = CRS.CRId
inner join A     on A.AId = AC.AId
where CRS.sequence = 2

so you get a value for crs.s for every crs record where there's an AC and an A record. You need to bring something in to define which of these records is the one you want!

You definitely need the link to D - probably like this

inner join D    on A.id2 = D.id2

after the other two inner join lines, but without knowing your data it's difficult to tell - there may be more criteria that you need to join on to constrain the query sufficiently to get just one return result.
0
 
needpleaseAuthor Commented:
Ok data:
Table D:
id            id2             cs
1             1q            john
2             2a            mary
3             3z             tom


Table A:
id2             Aid
1q             card
2a             walllet
3z             both

Table AC:
AID           CRTid
card           h
wallet         u
both           w

Table CRS:
CRID         Sequence              S
h                  1                          350
h                  2                          120
u                  1                          450
u                  2                          110
w                  1                          375
w                  2                          142


Based on the query I need, I should come up with is:
id             id2          cs          bks
1              1q          john       120
2              1a          mary      110
3              1z          tom         142

This is only a snapshot of the data, I am working with hundreds of thousands of records.
0
 
Racim BOUDJAKDJIDatabase Architect - Dba - Data ScientistCommented:
perhaps the below will help...

SELECT distinct
       id, 
       id2, 
       cs, 
       bks = (SELECT distinct crs.s 
              FROM   d, 
                     a, 
                     ac, 
                     crs 
              WHERE  d.id2 = a.id2 
                     AND a.aid = ac.aid 
                     AND ac.crtid = crs.crid 
                     AND crs.SEQUENCE = 2) 
FROM   d 

Open in new window

0
 
jogosCommented:
<<Changed it to the following:>>
The change you made was the solution itself.

The query you posted does not have the link between the select and the subquery . Why did you leave the '  where D.id2 = A.id2' out of my example #a37788504 , that is what provides that filter to get only what is linked with your table d.

On first sight the solution of racimo is identical as in my comment.
0
 
Racim BOUDJAKDJIDatabase Architect - Dba - Data ScientistCommented:
<<On first sight the solution of racimo is identical as in my comment.>>
Perhaps you should look closer...

The solution proposed aimed at primarily aiming at getting rid of any possible duplicates that INNER JOIN/LEFT JOIN operators the subquery generates by using the DISTINCT clause.  Bear in mind that INNER JOIN will generate  a cartesian product of the joking key duplicate rows in each joined table.  For example:
> If Key1 in Table1 has 2 duplicates
> If Key2 in Table2 is joined to Key1 in Table1 has 3 duplicates
> INNER JOIN Key1=Key2 will generate 6 dupplicates
0
 
jogosCommented:
I just refrased it by explicitly naming the joins (same solution as my first post)
and second I got even rid of the subselect -> better performance

-- Making the joins more readable with subselect
select d.id, d.id2, d.cs
, bks = (SELECT  crs.s
         from  A
         inner join AC ON A.AId = AC.AId
         inner join CRS ON AC.CRtId = CRS.CRId
         where D.id2 = A.id2  -- link with table d outside the subselect to filter
         and CRS.sequence = 2 -- the filter
        )
from d

-- Or better without subselect
select d.id, d.id2, d.cs
, crs.s as bks
from d
inner join A on a.id2 = d.id2 -- link with table d
inner join AC ON A.AId = AC.AId
inner join CRS ON AC.CRtId = CRS.CRId
where CRS.sequence = 2 -- the filter

Open in new window

0
 
jogosCommented:
<<<<On first sight the solution of racimo is identical as in my comment.>>
Perhaps you should look closer...>>
indeed, that's what you get on an early saturday morining
0
 
Racim BOUDJAKDJIDatabase Architect - Dba - Data ScientistCommented:
<<I just refrased it by explicitly naming the joins (same solution as my first post)
and second I got even rid of the subselect -> better performance>>
I have read your proposal before posting and you are correct that this proposal improves both query readability and perhaps performance (hard to say, since we do not have information about the index structure and statistics state --> is the optimizer well documented at run time enough to infer the same show plan with and without subselect?).

However, reading the OP posting and question title, it seems the tenure of the question is more about getting rid of duplicates then tuning or query readability.  To that problem, I have not seen a specific answer in the solutions you provided which is why I proposed one.  Hence, I believe your solution and mine  do not bring answers to the same questions and therefore are not identical as you claimed.  We could consider them complementary since they deal with different aspects of improving this query.

Taking your more readable format and merging it with my proposal, adding a DISTINCT clause can only guarantee a duplicate-free result, which I believe is what the OP is essentially looking.  Hence, the code would be...

select DISTINCT d.id, d.id2, d.cs, crs.s as bks
from d
inner join A on a.id2 = d.id2 
inner join AC ON A.AId = AC.AId
inner join CRS ON AC.CRtId = CRS.CRId
where CRS.sequence = 2

Open in new window


Regards,
0
 
jogosCommented:
@racimo
Both in the original query and your post the subselect can run on its own... returning multiple rows and even with a distinct you can have multiple rows.

Why? There is no link that limits the records selected in the subselect to only the records that apply to the record in the 'normal' select.   Having the table d both in the normal select as the subselect is unnecessary.

There where only 2 characters to much in the original query  'd,' in the subselect.

Distinct is a 'heavy' operation and if you don't need it, you should not use it. In to many cases (and here I think it is) it's just hiding a bad selection.
0
 
OCDanCommented:
Here you are mate:

select id, id2, cs, crss.s 'bks'
from
d join
(
select a.id2, crid, s
from crs
join ac on ac.crtid = crs.crid
join a on a.Aid = ac.aid
where sequence = 2
) crss on d.id2 = crss.id2


If you are getting duplicate results with the above code then you have numerous records with sequence = 2 per crs.crid. To check this use the below code:

SELECT crs.crid,
         count(*)
FROM crs
WHERE sequence = 2
GROUP BY crs.crid
HAVING count(*) > 1
0
 
Racim BOUDJAKDJIDatabase Architect - Dba - Data ScientistCommented:
<<Both in the original query and your post the subselect can run on its own returning multiple rows and even with a distinct you can have multiple rows.>>
Please reread my previous comments since I already explained this issue.  I will rephrase it differently:  If duplicates are present in the joining keys, you will get duplicates in the end result no matter how well written is your query or how performant it is because of the way the INNER JOIN/LEFT JOIN works by design.  

<<Why? There is no link that limits the records selected in the subselect to only the records that apply to the record in the 'normal' select.   Having the table d both in the normal select as the subselect is unnecessary. >>
Actually, the compiler/optimizer does take into consideration duplicate-free sub-selects before joining to the normal/upper select when a DISTINCT clause is a part of the subselect.  Run this to understand what I am talking about

select A.ID, A.VALUE
from (select 1 as ID, 'A' as VALUE union all select 1 as ID, 'A' as VALUE) A
inner join (select  AA.ID from (select 1 as ID, 'C' as VALUE union all select 1 as ID, 'D' as VALUE union all select 1 as ID, 'E' as VALUE) AA) B
on A.ID = B.ID

--> 6 duplicates with no DISTINCT

select  A.ID, A.VALUE
from (select 1 as ID, 'A' as VALUE union all select 1 as ID, 'A' as VALUE) A
inner join (select DISTINCT AA.ID from (select 1 as ID, 'C' as VALUE union all select 1 as ID, 'D' as VALUE union all select 1 as ID, 'E' as VALUE) AA) B
on A.ID = B.ID

--> 2 duplicates with DISTINCT in subselect only

select DISTINCT A.ID, A.VALUE
from (select 1 as ID, 'A' as VALUE union all select 1 as ID, 'A' as VALUE) A
inner join (select DISTINCT AA.ID from (select 1 as ID, 'C' as VALUE union all select 1 as ID, 'D' as VALUE union all select 1 as ID, 'E' as VALUE) AA) B
on A.ID = B.ID

--> 0 duplicates with DISTINCT in both subselect and normal/upper select

Open in new window



<<Distinct is a 'heavy' operation and if you don't need it, you should not use it.>>
Agreed.  But that is a trade off one may have to accept to guarantee result correctness, when suspecting duplicates in joining primary and foreign keys .

When that happens, DISTINCT is the simplest way to guarantee correctness, (not necessarily the fastest I admit).  On a restricted number of rows, it makes little or no difference, even performance wise.

<< In to many cases (and here I think it is) it's just hiding a bad selection>>
I don't believe the current question is about whether or not using DISTINCT is a good idea but rather what works best for the OP problem.  I recommend we see how that works for the OP and let him make his mind about that.

Hope this clarifies.
0
 
needpleaseAuthor Commented:
Hey OCDan,

The solution you provided will not work because:
select id, id2, cs, crss.s 'bks'
from
d join
(
select a.id2, crid, s
from crs
join ac on ac.crtid = crs.crid
join a on a.Aid = ac.aid
where sequence = 2
) crss on d.id2 = crss.id2

d and crs don't have a join
0
 
jogosCommented:
Why don't you comment on  posting http://www.experts-exchange.com/Microsoft/Development/Q_27656078.html#a37790704


<<d and crs don't have a join>>
That's why I already made in 3 variations with a join between your D and A, remember that I commented that you cut out the crusual part of my fist solution

Repeat
<<Changed it to the following:>>
The change you made was the solution itself.

The query you posted does not have the link between the select and the subquery . Why did you leave the '  where D.id2 = A.id2' out of my example  http://www.experts-exchange.com/Microsoft/Development/Q_27656078.htm#a37788504 , that is what provides that filter to get only what is linked with your table d.
0
 
needpleaseAuthor Commented:
Thank you
0

Featured Post

Keep up with what's happening at Experts Exchange!

Sign up to receive Decoded, a new monthly digest with product updates, feature release info, continuing education opportunities, and more.

  • 6
  • 5
  • 4
  • +3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now