asked on

nested select from a select statment with a "having" clause

I need help with a nested select

Basically, i have a billing system where multiple accounts can share the same contact id... (owner of the account). ie... john doe from 100 main street might have 3 accounts with us. I need to identify the accounts that share the same contact id.... which would answer the question... how many customers do we have that have multiple accounts

i can identify all the contact id's that have the multiple accounts with the following query. this does not give me the accountnumbers that i need.

Table 1) account table has the accountnumbers i need to retrieve (this has two columns of interest, Accountnumber, Accountkey(the joining variable)
Table 2) contact table has the contact information (the connector to this table from the account table is the column "AccountKey"

the following query identifies all contactkeys that are tied to multiple accounts

select b.Contactkey, count(b.contactkey) from billing.account a
inner join billing.contact b
on a.accountkey = b.accountkey
group by b.cntctkey
having count(b.cntctkey) > 1

Now i need to select all account numbers from billing.account that have a contact key in the result field from the query above. This is where i am stuck

select all accountnumbers from billing.account where accountkey is in the result of the above mentioned query.

Kyle Abrahams, PMP

select AccountKey, Accountnumber from billing.account
where AccountKey in
(
select  a.AccountKey from billing.account a
inner join billing.contact b
on a.accountkey = b.accountkey
group by b.cntctkey
having count(b.cntctkey) > 1
) MultipleContacts
order by contactkey, accountKey

Open in new window

Jim Metcalf

ASKER

The inner select is not working on its own

says a.accountkey is invalid in the select list because it is not contained in either an aggregate function or the group by clause

ASKER CERTIFIED SOLUTION

Scott Pletcher

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

Mark Wills

Your scenario is pretty much the reason why CTE was created. CTE is "Common Table Expression" and basically provide a method to do the subquery first, give it a name, so you can then do "things".

;with CTE_Accounts as
(
   select  b.Contactkey, count(b.contactkey) c from billing.account a
   inner join billing.contact b
   on a.accountkey = b.accountkey
   group by b.cntctkey
   having count(b.cntctkey) > 1
)  
   select BA.accountkey, BA.Accountnumber
   from billing.account BA
   inner join CTE_Accounts CTE on CTE.Contactkey = BA.accountkey

Open in new window

Scott Pletcher

I don't believe so. A derived table handles that just fine -- see my code above -- and derived tables have been around for many years.

CTEs were primarily created to enable recursion, including using the new "hierarchyid" data type. There were other reasons, I'm sure, but CTEs are absolutely required for recursion.

Mark Wills

And one of the advantages of CTE is that it establishes / materialises columns (such as count(), row_number() etc).

So it is also good for situations like

;with CTE_Accounts as
(
   select  b.Contactkey, count(b.contactkey) ctr from billing.account a
   inner join billing.contact b
   on a.accountkey = b.accountkey
   group by b.cntctkey
)  
   select BA.accountkey, BA.Accountnumber
   from billing.account BA
   inner join CTE_Accounts CTE on CTE.Contactkey = BA.accountkey 
   where CTE.ctr > 1

Open in new window

And, it is called CTE as per Common Table Expression. They are SQL compliant and part of the ANSI SQL 99 specification.

The fact that they can be used for recursion, is a benefit (or maybe a curse). But if that was the reason, then wouldnt MS have called it RCTE ;)

Anyway can read about it here : https://docs.microsoft.com/en-us/sql/t-sql/queries/with-common-table-expression-transact-sql

SOLUTION

Scott Pletcher

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

Kyle Abrahams, PMP

Updated query. You have to group by what you're selecting if not aggregating.

select AccountKey, Accountnumber from billing.account
where AccountKey in
(
select  a.AccountKey from billing.account a
inner join billing.contact b
on a.accountkey = b.accountkey
group by a.AccountKey
having count(b.cntctkey) > 1
) MultipleContacts
order by contactkey, accountKey

Open in new window

SOLUTION

Mark Wills

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

Scott Pletcher

Would normally complain about Contactkey if grouping by cntctkey.

I figured that the actual column name is "cntctkey" but for posting purposes the author used "Contactkey" just to make it clearer to us. But obviously don't know that for sure.

Mark Wills

Apologies, I had a typo and had to fix.... ontactkey s/b contactkey...

Scott Pletcher

The derived table is useful here to avoid joining all the rows in the contact table to the account table. That's just wasteful. We can do the grouping and count check first on the contact table, then only join what's actually needed to the account table. That's why I wrote the query that way, for efficiency.

Mark Wills

Yes, a derived table can be useful. Can also be achieved by CTE. But the main point is that Jamesmetcalf74 is happy with

select b.Contactkey, count(b.contactkey) from billing.account a
inner join billing.contact b
on a.accountkey = b.accountkey
group by b.cntctkey
having count(b.cntctkey) > 1

as his method for finding duplicates. That was clearly stated. Maybe it is when tables are joined that the problem manifests itself. While I agree with you in principle, we simply do not know the environment well enough to make assumptions. Who knows ,there may be millions of contacts where only a subset are associated with Billing.Account. I dont think we can make those assumptions without know more. Including the assumption that the stated query runs, or, which column names are the correct ones (50:50 chance).

So, from a solution perspective, the construct of

;with CTE_Accounts as
(  -- insert / replace the query below with your own working query
   select  b.Contactkey, count(b.contactkey) ctr from billing.account a
   inner join billing.contact b
   on a.accountkey = b.accountkey
   group by b.cntctkey
   having count(b.cntctkey) > 1
)  -- Now we can use the results of the above to do "things"
   select BA.accountkey, BA.Accountnumber
   from billing.account BA
   inner join CTE_Accounts CTE on CTE.Contactkey = BA.accountkey

Open in new window

Does achieve his stated goal of using the results of his query to then find accountnumbers. Almost literally as described. But, until we hear back from jamesmetcalf74, we will be guessing.

Scott Pletcher

There's no guess. There will be fewer rows after GROUPing than before. That's self-evident.

I've seen many, many times in SQL Server that joining after reducing the set size performs far better than joining afterwards.

Mark Wills

@Jamesmetcalf74,

Is there anything else we can do for you ?

Cheers,
Mark

Jim Metcalf

ASKER

Thank you gentlemen

nested select from a select statment with a &quot;having&quot; clause

nested select from a select statment with a "having" clause