?
Solved

SQL Correlated Subquery in SELECT clause

Posted on 2014-08-01
10
Medium Priority
?
539 Views
Last Modified: 2014-08-05
I have a query that returns the following data:

Name            Age      Phone
John Smith    40        555-5555
John Smith    40        222-2222

I want it to only return a single row, because this is causing us to have duplicates in the report.
Current Query
Select Contact.Name, Contact.Age, ContactInfo.Phone
From Contact left join ContactInfo on Contact.Id = ContactInfo.Id

Open in new window


New Query
Select Contact.Name, Contact.Age, 
 ( Select TOP 1 CI.Phone
   From ContactInfo CI left join Contact C ON C.Id = CI.Id
   Where C.Id = Contact.Id
   Order By CI.Phone ASC
 ) as Phone
From Contact left join ContactInfo on Contact.Id = ContactInfo.Id

Open in new window


Is there anything you see wrong with the correlated subquery? It's giving me an error message:

Column 'Contact.Id' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
0
Comment
Question by:pzozulka
  • 3
  • 3
  • 2
  • +2
10 Comments
 
LVL 34

Expert Comment

by:Brian Crowe
ID: 40234646
You need to define your logic for determining which phone number you want.  Do you want the newest/oldest (if so do you have a way to determine this) or some other method?  Taking the TOP 1 by sorting by the phone number itself doesn't make much sense.  If you're going to do that then you might as well use:

SELECT C.Name, C.Age, MIN(CI.Phone) AS Phone
From Contact AS C
LEFT OUTER JOIN ContactInfo AS CI
   ON C.Id = CI.Id
GROUP BY C.Name, C.Age

Open in new window

0
 
LVL 8

Author Comment

by:pzozulka
ID: 40234669
I just want to understand what I was doing wrong. Is it because I was using C.Id in my subquery and did not have that field in my outer query's Group By?
0
 
LVL 70

Expert Comment

by:Scott Pletcher
ID: 40234674
OUTER APPLY would likely be the best current way to do that:

Select c.Name, c.Age, oa1.Phone
From Contact c
Outer Apply (
    Select TOP (1) ci.Phone
    From ContactInfo ci
    WHERE ci.Id = c.Id
    Order By ci.Phone ASC
) AS oa1
0
Get free NFR key for Veeam Availability Suite 9.5

Veeam is happy to provide a free NFR license (1 year, 2 sockets) to all certified IT Pros. The license allows for the non-production use of Veeam Availability Suite v9.5 in your home lab, without any feature limitations. It works for both VMware and Hyper-V environments

 
LVL 14

Expert Comment

by:Russell Fox
ID: 40235128
It's because you need to alias the outer Contact, otherwise the inner query doesn't know what "Contact.ID" refers to - it probably thinks you're referring to the inner Contact table aliased as "C", but without using the alias. Also, since you're using the subquery to get the phone number, you don't need those other joins:
Select t1.NAME
	, t1.Age 
	,	(Select TOP 1 CI.Phone
		From ContactInfo CI
		Where C.Id = t1.Id
		Order By CI.Phone ASC
		) 
	AS Phone
From Contact t1

Open in new window

0
 
LVL 49

Expert Comment

by:PortletPaul
ID: 40236055
the error message in the question does not relate to the query in the question.
Column 'Contact.Id' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
There is no "group by" clause displayed in that query

However either the correlated subquery in outer apply or select should alleviate you problem. But don't then left join back to ContactInfo or you will still have unwanted repetition in the result. If you need more than just one field from ContactInfo then I would do it a different way - using row_number()
SELECT
      Contact.Name
    , Contact.Age
    , CI.*
FROM Contact
      LEFT JOIN (
                  SELECT
                        ContactInfo.*
                      , ROW_NUMBER() OVER (PARTITION BY ContactInfo.Id
                                           ORDER BY ContactInfo.phone ASC) AS rn
                  FROM ContactInfo
            ) AS CI
                  ON Contact.Id = CI.Id
                        AND CI.rn = 1

Open in new window

However I agree with Brian Crowe, sorting just on phone number to select which contact information is relevant seems decidedly odd. Is there no concept of "preferred" contact details or similar to work with?
0
 
LVL 8

Author Comment

by:pzozulka
ID: 40241872
Taking the TOP 1 by sorting by the phone number itself doesn't make much sense.  If you're going to do that then you might as well use:
The actual query is slightly more complicated than that. There are actually 2 columns that I need to select, but only want one row to appear.  This was just a silly example I tried to come up with to represent the actual problem.

I tried using the suggested approach of simply using MIN(column1), MIN(column2) and it seems to work fine except for one problem -- it shows the minimum of whatever is in column1 and the minimum of whatever is in column2, but they are not necessarily representing the same row.

ID      Name            Age      Phone           Description
1        John Smith    40        555-5555      Cell
2        John Smith    40        222-2222      Fax

Let's just use this example. The real example uses Territory (ex. southern, eastern, western), and a Description (ex. California, Texas, New York).

When trying to SELECT the MIN(Phone), MIN(Description) it seems to work out great until you run a few test cases and notice that at times you get the following results:

1     John Smith   40 222-2222   Cell

Which is obviously wrong because that number is not a Cell Phone. I need the results to show 222-2222 Fax.
0
 
LVL 8

Author Comment

by:pzozulka
ID: 40241953
ScottPletcher: I tried the Outer Apply approach too, and am not getting the single row results I am looking for because it's making me put the oa1.column field into my main Group By clause. If I don't do that then I get the following error message:

Column 'oa1.ColumnName' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
0
 
LVL 70

Expert Comment

by:Scott Pletcher
ID: 40242036
You could just add a MIN or MAX around the phone number instead, to limit it to a single value.

SELECT ..., MIN(oa1.ColumnName) | MAX(oa1.ColumnName), ...
...
OUTER APPLY (
    ...
) AS oa1
GROUP BY ...
0
 
LVL 49

Accepted Solution

by:
PortletPaul earned 2000 total points
ID: 40242678
"it shows the minimum of whatever is in column1 and the minimum of whatever is in column2, but they are not necessarily representing the same row." (emphasis added)

That's correct, that is precisely what MIN() will do; return the minimum, and for 2 columns those minimums do not necessarily come from the same record.

If returning a particular record is what you need then do NOT use "group by".

Instead use row_number()
I have already proposed this at http:#a40236055
0
 
LVL 49

Assisted Solution

by:PortletPaul
PortletPaul earned 2000 total points
ID: 40242682
This result
| ID |       NAME | AGE |    PHONE | DESCRIPTION |
|----|------------|-----|----------|-------------|
|  2 | John Smith |  40 | 222-2222 |         Fax |

Open in new window

is from this query:
SELECT
      ID
    , Name
    , Age
    , Phone
    , Description
FROM (
            SELECT
                  ID
                , Name
                , Age
                , Phone
                , Description
                , ROW_NUMBER() OVER (PARTITION BY name, age 
                                     ORDER BY phone ASC)   AS rn
            FROM ContactInfo
      ) AS derived
WHERE rn = 1



CREATE TABLE ContactInfo
	([ID] int, [Name] varchar(10), [Age] int, [Phone] varchar(8), [Description] varchar(4))
;
	
INSERT INTO ContactInfo
	([ID], [Name], [Age], [Phone], [Description])
VALUES
	(1, 'John Smith', 40, '555-5555', 'Cell'),
	(2, 'John Smith', 40, '222-2222', 'Fax')
;

Open in new window

Try it yourself here: http://sqlfiddle.com/#!3/2f0a0/1
0

Featured Post

Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Ready to get certified? Check out some courses that help you prepare for third-party exams.
Microsoft Access has a limit of 255 columns in a single table; SQL Server allows tables with over 255 columns, but reading that data is not necessarily simple.  The final solution for this task involved creating a custom text parser and then reading…
Using examples as well as descriptions, and references to Books Online, show the documentation available for date manipulation functions and by using a select few of these functions, show how date based data can be manipulated with these functions.
Via a live example, show how to set up a backup for SQL Server using a Maintenance Plan and how to schedule the job into SQL Server Agent.
Suggested Courses

840 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question