Jim Metcalf
asked on
nested select from a select statment with a "having" clause
I need help with a nested select
Basically, i have a billing system where multiple accounts can share the same contact id... (owner of the account). ie... john doe from 100 main street might have 3 accounts with us. I need to identify the accounts that share the same contact id.... which would answer the question... how many customers do we have that have multiple accounts
i can identify all the contact id's that have the multiple accounts with the following query. this does not give me the accountnumbers that i need.
Table 1) account table has the accountnumbers i need to retrieve (this has two columns of interest, Accountnumber, Accountkey(the joining variable)
Table 2) contact table has the contact information (the connector to this table from the account table is the column "AccountKey"
the following query identifies all contactkeys that are tied to multiple accounts
select b.Contactkey, count(b.contactkey) from billing.account a
inner join billing.contact b
on a.accountkey = b.accountkey
group by b.cntctkey
having count(b.cntctkey) > 1
Now i need to select all account numbers from billing.account that have a contact key in the result field from the query above. This is where i am stuck
select all accountnumbers from billing.account where accountkey is in the result of the above mentioned query.
Basically, i have a billing system where multiple accounts can share the same contact id... (owner of the account). ie... john doe from 100 main street might have 3 accounts with us. I need to identify the accounts that share the same contact id.... which would answer the question... how many customers do we have that have multiple accounts
i can identify all the contact id's that have the multiple accounts with the following query. this does not give me the accountnumbers that i need.
Table 1) account table has the accountnumbers i need to retrieve (this has two columns of interest, Accountnumber, Accountkey(the joining variable)
Table 2) contact table has the contact information (the connector to this table from the account table is the column "AccountKey"
the following query identifies all contactkeys that are tied to multiple accounts
select b.Contactkey, count(b.contactkey) from billing.account a
inner join billing.contact b
on a.accountkey = b.accountkey
group by b.cntctkey
having count(b.cntctkey) > 1
Now i need to select all account numbers from billing.account that have a contact key in the result field from the query above. This is where i am stuck
select all accountnumbers from billing.account where accountkey is in the result of the above mentioned query.
ASKER
The inner select is not working on its own
says a.accountkey is invalid in the select list because it is not contained in either an aggregate function or the group by clause
says a.accountkey is invalid in the select list because it is not contained in either an aggregate function or the group by clause
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Your scenario is pretty much the reason why CTE was created. CTE is "Common Table Expression" and basically provide a method to do the subquery first, give it a name, so you can then do "things".
;with CTE_Accounts as
(
select b.Contactkey, count(b.contactkey) c from billing.account a
inner join billing.contact b
on a.accountkey = b.accountkey
group by b.cntctkey
having count(b.cntctkey) > 1
)
select BA.accountkey, BA.Accountnumber
from billing.account BA
inner join CTE_Accounts CTE on CTE.Contactkey = BA.accountkey
I don't believe so. A derived table handles that just fine -- see my code above -- and derived tables have been around for many years.
CTEs were primarily created to enable recursion, including using the new "hierarchyid" data type. There were other reasons, I'm sure, but CTEs are absolutely required for recursion.
CTEs were primarily created to enable recursion, including using the new "hierarchyid" data type. There were other reasons, I'm sure, but CTEs are absolutely required for recursion.
And one of the advantages of CTE is that it establishes / materialises columns (such as count(), row_number() etc).
So it is also good for situations like
And, it is called CTE as per Common Table Expression. They are SQL compliant and part of the ANSI SQL 99 specification.
The fact that they can be used for recursion, is a benefit (or maybe a curse). But if that was the reason, then wouldnt MS have called it RCTE ;)
Anyway can read about it here : https://docs.microsoft.com/en-us/sql/t-sql/queries/with-common-table-expression-transact-sql
So it is also good for situations like
;with CTE_Accounts as
(
select b.Contactkey, count(b.contactkey) ctr from billing.account a
inner join billing.contact b
on a.accountkey = b.accountkey
group by b.cntctkey
)
select BA.accountkey, BA.Accountnumber
from billing.account BA
inner join CTE_Accounts CTE on CTE.Contactkey = BA.accountkey
where CTE.ctr > 1
And, it is called CTE as per Common Table Expression. They are SQL compliant and part of the ANSI SQL 99 specification.
The fact that they can be used for recursion, is a benefit (or maybe a curse). But if that was the reason, then wouldnt MS have called it RCTE ;)
Anyway can read about it here : https://docs.microsoft.com/en-us/sql/t-sql/queries/with-common-table-expression-transact-sql
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Updated query. You have to group by what you're selecting if not aggregating.
select AccountKey, Accountnumber from billing.account
where AccountKey in
(
select a.AccountKey from billing.account a
inner join billing.contact b
on a.accountkey = b.accountkey
group by a.AccountKey
having count(b.cntctkey) > 1
) MultipleContacts
order by contactkey, accountKey
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Would normally complain about Contactkey if grouping by cntctkey.
I figured that the actual column name is "cntctkey" but for posting purposes the author used "Contactkey" just to make it clearer to us. But obviously don't know that for sure.
Apologies, I had a typo and had to fix.... ontactkey s/b contactkey...
The derived table is useful here to avoid joining all the rows in the contact table to the account table. That's just wasteful. We can do the grouping and count check first on the contact table, then only join what's actually needed to the account table. That's why I wrote the query that way, for efficiency.
Yes, a derived table can be useful. Can also be achieved by CTE. But the main point is that Jamesmetcalf74 is happy with
select b.Contactkey, count(b.contactkey) from billing.account a
inner join billing.contact b
on a.accountkey = b.accountkey
group by b.cntctkey
having count(b.cntctkey) > 1
as his method for finding duplicates. That was clearly stated. Maybe it is when tables are joined that the problem manifests itself. While I agree with you in principle, we simply do not know the environment well enough to make assumptions. Who knows ,there may be millions of contacts where only a subset are associated with Billing.Account. I dont think we can make those assumptions without know more. Including the assumption that the stated query runs, or, which column names are the correct ones (50:50 chance).
So, from a solution perspective, the construct of
select b.Contactkey, count(b.contactkey) from billing.account a
inner join billing.contact b
on a.accountkey = b.accountkey
group by b.cntctkey
having count(b.cntctkey) > 1
as his method for finding duplicates. That was clearly stated. Maybe it is when tables are joined that the problem manifests itself. While I agree with you in principle, we simply do not know the environment well enough to make assumptions. Who knows ,there may be millions of contacts where only a subset are associated with Billing.Account. I dont think we can make those assumptions without know more. Including the assumption that the stated query runs, or, which column names are the correct ones (50:50 chance).
So, from a solution perspective, the construct of
;with CTE_Accounts as
( -- insert / replace the query below with your own working query
select b.Contactkey, count(b.contactkey) ctr from billing.account a
inner join billing.contact b
on a.accountkey = b.accountkey
group by b.cntctkey
having count(b.cntctkey) > 1
) -- Now we can use the results of the above to do "things"
select BA.accountkey, BA.Accountnumber
from billing.account BA
inner join CTE_Accounts CTE on CTE.Contactkey = BA.accountkey
Does achieve his stated goal of using the results of his query to then find accountnumbers. Almost literally as described. But, until we hear back from jamesmetcalf74, we will be guessing.
There's no guess. There will be fewer rows after GROUPing than before. That's self-evident.
I've seen many, many times in SQL Server that joining after reducing the set size performs far better than joining afterwards.
I've seen many, many times in SQL Server that joining after reducing the set size performs far better than joining afterwards.
@Jamesmetcalf74,
Is there anything else we can do for you ?
Cheers,
Mark
Is there anything else we can do for you ?
Cheers,
Mark
ASKER
Thank you gentlemen
Open in new window