Link to home
Start Free TrialLog in
Avatar of Jim Metcalf
Jim MetcalfFlag for United States of America

asked on

nested select from a select statment with a "having" clause

I need help with a nested select

Basically, i have a billing system where multiple accounts can share the same contact id... (owner of the account).  ie... john doe from 100 main street might have 3 accounts with us.  I need to identify the accounts that share the same contact id....    which would answer the question...   how many customers do we have that have multiple accounts

i can identify all the contact id's that have the multiple accounts with the following query.  this does not give me the accountnumbers that i need.

Table 1) account table has the accountnumbers i need to retrieve  (this has two columns of interest, Accountnumber, Accountkey(the joining variable)
Table 2) contact table has the contact information (the connector to this table from the account table is the column "AccountKey"

the following query identifies all contactkeys that are tied to multiple accounts

select  b.Contactkey, count(b.contactkey) from billing.account a
inner join billing.contact b
on a.accountkey = b.accountkey
group by b.cntctkey
having count(b.cntctkey) > 1

Now i need to select all account numbers from billing.account that have a contact key in the result field from the query above.  This is where i am stuck

select all accountnumbers from billing.account where accountkey is in the result of the above mentioned query.
Avatar of Kyle Abrahams, PMP
Kyle Abrahams, PMP
Flag of United States of America image

select AccountKey, Accountnumber from billing.account
where AccountKey in
(
select  a.AccountKey from billing.account a
inner join billing.contact b
on a.accountkey = b.accountkey
group by b.cntctkey
having count(b.cntctkey) > 1
) MultipleContacts
order by contactkey, accountKey

Open in new window

Avatar of Jim Metcalf

ASKER

The inner select is not working on its own

says a.accountkey is invalid in the select list because it is not contained in either an aggregate function or the group by clause
ASKER CERTIFIED SOLUTION
Avatar of Scott Pletcher
Scott Pletcher
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Your scenario is pretty much the reason why CTE was created. CTE is "Common Table Expression" and basically provide a method to do the subquery first, give it a name, so you can then do "things".

;with CTE_Accounts as
(
   select  b.Contactkey, count(b.contactkey) c from billing.account a
   inner join billing.contact b
   on a.accountkey = b.accountkey
   group by b.cntctkey
   having count(b.cntctkey) > 1
)  
   select BA.accountkey, BA.Accountnumber
   from billing.account BA
   inner join CTE_Accounts CTE on CTE.Contactkey = BA.accountkey 
  

Open in new window

I don't believe so.  A derived table handles that just fine -- see my code above -- and derived tables have been around for many years.

CTEs were primarily created to enable recursion, including using the new "hierarchyid" data type.  There were other reasons, I'm sure, but CTEs are absolutely required for recursion.
And one of the advantages of CTE is that it establishes / materialises columns (such as count(), row_number() etc).

So it is also good for situations like
;with CTE_Accounts as
(
   select  b.Contactkey, count(b.contactkey) ctr from billing.account a
   inner join billing.contact b
   on a.accountkey = b.accountkey
   group by b.cntctkey
)  
   select BA.accountkey, BA.Accountnumber
   from billing.account BA
   inner join CTE_Accounts CTE on CTE.Contactkey = BA.accountkey 
   where CTE.ctr > 1 
  

Open in new window


And, it is called CTE as per Common Table Expression. They are SQL compliant and part of the ANSI SQL 99 specification.

The fact that they can be used for recursion, is a benefit (or maybe a curse). But if that was the reason, then wouldnt MS have called it RCTE ;)

Anyway can read about it here : https://docs.microsoft.com/en-us/sql/t-sql/queries/with-common-table-expression-transact-sql
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Updated query.  You have to group by what you're selecting if not aggregating.

select AccountKey, Accountnumber from billing.account
where AccountKey in
(
select  a.AccountKey from billing.account a
inner join billing.contact b
on a.accountkey = b.accountkey
group by a.AccountKey
having count(b.cntctkey) > 1
) MultipleContacts
order by contactkey, accountKey

Open in new window

SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Would normally complain about Contactkey if grouping by cntctkey.

I figured that the actual column name is "cntctkey" but for posting purposes the author used "Contactkey" just to make it clearer to us.  But obviously don't know that for sure.
Apologies, I had a typo and had to fix....  ontactkey s/b contactkey...
The derived table is useful here to avoid joining all the rows in the contact table to the account table.  That's just wasteful.  We can do the grouping and count check first on the contact table, then only join what's actually needed to the account table.  That's why I wrote the query that way, for efficiency.
Yes, a derived table can be useful. Can also be achieved by CTE. But the main point is that Jamesmetcalf74 is happy with

    select  b.Contactkey, count(b.contactkey)  from billing.account a
    inner join billing.contact b
    on a.accountkey = b.accountkey
    group by b.cntctkey
   having count(b.cntctkey) > 1

as his method for finding duplicates. That was clearly stated. Maybe it is when tables are joined that the problem manifests itself. While I agree with you in principle, we simply do not know the environment well enough to make assumptions. Who knows ,there may be millions of contacts where only a subset are associated with Billing.Account. I dont think we can make those assumptions without know more. Including the assumption that the stated query runs, or, which column names are the correct ones (50:50 chance).

So, from a solution perspective, the construct of
;with CTE_Accounts as
(  -- insert / replace the query below with your own working query
   select  b.Contactkey, count(b.contactkey) ctr from billing.account a
   inner join billing.contact b
   on a.accountkey = b.accountkey
   group by b.cntctkey
   having count(b.cntctkey) > 1
)  -- Now we can use the results of the above to do "things"
   select BA.accountkey, BA.Accountnumber
   from billing.account BA
   inner join CTE_Accounts CTE on CTE.Contactkey = BA.accountkey 

Open in new window

Does achieve his stated goal of using the results of his query to then find accountnumbers. Almost literally as described. But, until we hear back from jamesmetcalf74, we will be guessing.
There's no guess.  There will be fewer rows after GROUPing than before.  That's self-evident.

I've seen many, many times in SQL Server that joining after reducing the set size performs far better than joining afterwards.
@Jamesmetcalf74,

Is there anything else we can do for you ?

Cheers,
Mark
Thank you gentlemen