Solved

Ms SQL Compound Statement that queries the results of a UNION

Posted on 2009-05-08
15
608 Views
Last Modified: 2012-05-06
I am trying to write a compound state that queries the results of a UNION in MS SQL 2005.  I am trying the following structure but can't get it to work:

Select XXXX
(
select YYYYYY
UNION
select ZZZZZ
)

The result of the union looks like
name1 | 200
name2 | 100
name1 | 300
name2 | 150

I want to run a query on the result of the union so I end up with
name1 | 500
name2 | 250

select PrimaryAuthor, SUM(ND1) FROM

(select PrimaryAuthor, COUNT(PrimaryAuthor) as ND1 

FROM DataTableCopy

Where PrimaryAuthor is not null and PrimaryAuthor <> 'Null'

GROUP BY PrimaryAuthor 

union all

select SecondaryAuthor, COUNT(SecondaryAuthor) as ND2 

FROM DataTableCopy

Where PrimaryAuthor is not null and SecondaryAuthor <> 'Null'

GROUP BY SecondaryAuthor)

Open in new window

0
Comment
Question by:TedgCl
  • 5
  • 3
  • 2
  • +3
15 Comments
 
LVL 142

Accepted Solution

by:
Guy Hengel [angelIII / a3] earned 400 total points
ID: 24336298
you are close:
select PrimaryAuthor, SUM(ND1) FROM

(select PrimaryAuthor, COUNT(PrimaryAuthor) as ND1 

FROM DataTableCopy

Where PrimaryAuthor is not null and PrimaryAuthor <> 'Null'

GROUP BY PrimaryAuthor 

union all

select SecondaryAuthor, COUNT(SecondaryAuthor) as ND2 

FROM DataTableCopy

Where PrimaryAuthor is not null and SecondaryAuthor <> 'Null'

GROUP BY SecondaryAuthor

) sq

GROUP BY PrimaryAuthor

Open in new window

0
 
LVL 26

Assisted Solution

by:tigin44
tigin44 earned 100 total points
ID: 24343004
use UNION instead of UNOIN ALL

SELECT Author, SUM(ND) AS TotalCount

FROM

		(

		SELECT PrimaryAuthor AS Author, COUNT(PrimaryAuthor) AS ND

		FROM DataTableCopy

		WHERE PrimaryAuthor IS NOT NULL 

		  AND PrimaryAuthor <> 'Null'

		GROUP BY PrimaryAuthor 

		UNION

		SELECT SecondaryAuthor AS Author, COUNT(SecondaryAuthor) AS ND

		FROM DataTableCopy

		WHERE SecondaryAuthor IS NOT NULL 

		  AND SecondaryAuthor <> 'Null'

		GROUP BY SecondaryAuthor

		) A

GROUP BY Author

Open in new window

0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 24343015
tigin44,
 
  can you explain in how far the UNION instead of UNION ALL will help?
0
 
LVL 26

Expert Comment

by:Chris Luttrell
ID: 24343065
tigin44, et al,
"Union" instead of "Union All" is not what you want here.  Given the original example but with some possible values changed to illustrate the problem, say name2 had 100 entries as both the Primary Author and the Secondary Author then the sample logical union should look like this:
name1 | 200
name2 | 100
name1 | 300
name2 | 100
and the proper results of the query on the result of the union should end up with this
name1 | 500
name2 | 200
but the Union (without the All modifier) will only return the distinct values from both sets and will return this
name1 | 200
name2 | 100
name1 | 300
resulting in the query results being wrong and giving
name1 | 500
name2 | 100
So angelIII is of course correct in using Union All and now others can know why.
0
 
LVL 40

Expert Comment

by:mrjoltcola
ID: 25007675
Unrelatd, but I will also point out that:

>>Where PrimaryAuthor is not null and PrimaryAuthor <> 'Null'

can be rewritten as:

>>Where PrimaryAuthor <> 'Null'


Because the very presence of compare (<>) ignores null values for consideration.

While I understand the original intent was probably to say "Give me all values where PrimaryAuthor is not null, oh, oops we also have some values where the string 'Null' is, so lets filter those as well..." but in the end it is redundant.

Also I note that the accepted solution changed:

>>Where PrimaryAuthor is not null and SecondaryAuthor <> 'Null'

To:

>>WHERE SecondaryAuthor IS NOT NULL AND SecondaryAuthor <> 'Null'



Thought I assume that is correct, it is not clear as to the asker's intent. If this change is correct, then my above comment also applies to this line as well.

NULLs don't count in comparisons, likes, etc. as they are "non-existent"
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 25007824
Concur with CGLuttrell. We want to sum every instance therefore union all is the correct operation.

However, primary Author could also be secondary Author, and possibly a distinction is required.

The second query might have an incorrect check of primaryauthor is not null rather than secondaryauthor is not null, though, it is also reasonable to only count secondaryauthors if there is a primaryauthor.


Check out :


create table #datatablecopy (PrimaryAuthor varchar(10), SecondaryAuthor varchar(10))

insert #datatablecopy values ('p1','s1')

insert #datatablecopy values ('p2','p1')

insert #datatablecopy values ('p3','s2')

insert #datatablecopy values ('p4','s3')

insert #datatablecopy values ('p5','p3')
 
 

select Author, sum(nd) 

from (

SELECT PrimaryAuthor AS Author, COUNT(PrimaryAuthor) AS ND

		FROM #DataTableCopy

		WHERE PrimaryAuthor IS NOT NULL AND PrimaryAuthor <> 'Null'

		GROUP BY PrimaryAuthor 

		UNION ALL

		SELECT SecondaryAuthor AS Author, COUNT(SecondaryAuthor) AS ND

		FROM #DataTableCopy

		WHERE SecondaryAuthor IS NOT NULL AND SecondaryAuthor <> 'Null'

		GROUP BY SecondaryAuthor

) b group by author
 

-- p1 and p3 each score 2
 
 

select Author, sum(nd) 

from (

SELECT PrimaryAuthor AS Author, COUNT(PrimaryAuthor) AS ND

		FROM #DataTableCopy

		WHERE PrimaryAuthor IS NOT NULL AND PrimaryAuthor <> 'Null'

		GROUP BY PrimaryAuthor 

		UNION

		SELECT SecondaryAuthor AS Author, COUNT(SecondaryAuthor) AS ND

		FROM #DataTableCopy

		WHERE SecondaryAuthor IS NOT NULL AND SecondaryAuthor <> 'Null'

		GROUP BY SecondaryAuthor

) b group by author
 

-- p1 and p3 each score 1

Open in new window

0
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 25007893
I admit, I did not see the "secondaryAuthor" vs "primaryAuthor", though I do presume it was a typo.
if it was not, fine with me.

I still have to argue over the "UNION", as it can produce wrong results.

show this test script:
create table DataTableCopy (PrimaryAuthor varchar(10), SecondaryAuthor varchar(10))

go

set nocount on

insert into DataTableCopy values ( 'data1', 'data2' )

insert into DataTableCopy values( 'data1', 'data3' )

insert into DataTableCopy values( 'data2', 'data3')

insert into DataTableCopy values( 'data2', 'data1')

insert into DataTableCopy values( 'data3', 'data1')
 

go
 

select PrimaryAuthor, COUNT(PrimaryAuthor) as ND1 

FROM DataTableCopy

Where PrimaryAuthor is not null and PrimaryAuthor <> 'Null'

GROUP BY PrimaryAuthor 
 

select SecondaryAuthor, COUNT(SecondaryAuthor) as ND1 

FROM DataTableCopy

Where SecondaryAuthor is not null and SecondaryAuthor <> 'Null'

GROUP BY SecondaryAuthor 
 

select PrimaryAuthor, SUM(ND1) FROM

(select PrimaryAuthor, COUNT(PrimaryAuthor) as ND1 

FROM DataTableCopy

Where PrimaryAuthor is not null and PrimaryAuthor <> 'Null'

GROUP BY PrimaryAuthor 

union 

select SecondaryAuthor, COUNT(SecondaryAuthor) as ND2 

FROM DataTableCopy

Where SecondaryAuthor is not null and SecondaryAuthor <> 'Null'

GROUP BY SecondaryAuthor

) sq

GROUP BY PrimaryAuthor
 
 
 

select PrimaryAuthor, SUM(ND1) FROM

(select PrimaryAuthor, COUNT(PrimaryAuthor) as ND1 

FROM DataTableCopy

Where PrimaryAuthor is not null and PrimaryAuthor <> 'Null'

GROUP BY PrimaryAuthor 

union all

select SecondaryAuthor, COUNT(SecondaryAuthor) as ND2 

FROM DataTableCopy

Where SecondaryAuthor is not null and SecondaryAuthor <> 'Null'

GROUP BY SecondaryAuthor

) sq

GROUP BY PrimaryAuthor
 
 
 
 
 

go

drop table DataTableCopy

Open in new window

0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 25007898
results:
PrimaryAuthor ND1

------------- -----------

data1         2

data2         2

data3         1
 

SecondaryAuthor ND1

--------------- -----------

data1           2

data2           1

data3           2
 

PrimaryAuthor 

------------- -----------

data1         2    <<<<< ooups <<< should be 4. UNION ate up 1 row from the subquery

data2         3 

data3         3
 

PrimaryAuthor 

------------- -----------

data1         4

data2         3

data3         3

Open in new window

0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 25008127
and if we add the extra entry of a primaryauthor being NULL as per:

insert #datatablecopy values (null,'p3')
insert #datatablecopy values (null,'p6')

and use the original primaryauthor is not null in the secondaryauthor select per angelIII's post versus change to secondaryauthor in tigin44's post, then we see even greater variations.


create table #datatablecopy (PrimaryAuthor varchar(10), SecondaryAuthor varchar(10))

insert #datatablecopy values ('p1','s1')

insert #datatablecopy values ('p2','p1')

insert #datatablecopy values ('p3','s2')

insert #datatablecopy values ('p4','s3')

insert #datatablecopy values ('p5','p3')

insert #datatablecopy values (null,'p3')

insert #datatablecopy values (null,'p6')
 

-- angelIII's post
 

select Author, sum(nd) 

from (

		SELECT PrimaryAuthor AS Author, COUNT(PrimaryAuthor) AS ND

		FROM #DataTableCopy

		WHERE PrimaryAuthor <> 'Null'

		GROUP BY PrimaryAuthor 

		UNION ALL

		SELECT SecondaryAuthor AS Author, COUNT(SecondaryAuthor) AS ND

		FROM #DataTableCopy

		WHERE PrimaryAuthor IS NOT NULL AND SecondaryAuthor <> 'Null'

		GROUP BY SecondaryAuthor

) b group by author
 

-- tigin44's
 

select Author, sum(nd) 

from (

		SELECT PrimaryAuthor AS Author, COUNT(PrimaryAuthor) AS ND

		FROM #DataTableCopy

		WHERE PrimaryAuthor IS NOT NULL AND PrimaryAuthor <> 'Null'

		GROUP BY PrimaryAuthor 

		UNION

		SELECT SecondaryAuthor AS Author, COUNT(SecondaryAuthor) AS ND

		FROM #DataTableCopy

		WHERE SecondaryAuthor IS NOT NULL AND SecondaryAuthor <> 'Null'

		GROUP BY SecondaryAuthor

) b group by author

Open in new window

0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 25008403
Didn't notice, but can now see CGLuttrell was incomplete in the second example... Should also have excluded the second instance of name1|300, but the thought was there :)

and really, why not simply do :

select Author, Count(*)
from (
      SELECT PrimaryAuthor AS Author FROM DataTableCopy WHERE PrimaryAuthor is not Null
      UNION ALL
      SELECT SecondaryAuthor FROM DataTableCopy WHERE SecondaryAuthor is not Null
) b group by author


It is quicker...
0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 25011250
given the field name "issue", I consider a split between the 2 first posts fair, as both gave necessary input (although neither gave an explanation ...)
0
 

Author Comment

by:TedgCl
ID: 25029955
AngelIII I believe you are correct.  This has turned out to be a very good thread.  How do I reassign the points?
0
 

Author Closing Comment

by:TedgCl
ID: 31579453
Thanks!
0

Featured Post

VMware Disaster Recovery and Data Protection

In this expert guide, you’ll learn about the components of a Modern Data Center. You will use cases for the value-added capabilities of Veeam®, including combining backup and replication for VMware disaster recovery and using replication for data center migration.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
TSQL previous 5 25
Where to download and how to install sqldmo.dll 5 34
ms sql last 8 weeks as columns 5 28
Sql Query 4 16
This article describes how to use the timestamp of existing data in a database to allow Tableau to calculate the prior work day instead of relying on case statements or if statements to calculate the days of the week.
In this article we will get to know that how can we recover deleted data if it happens accidently. We really can recover deleted rows if we know the time when data is deleted by using the transaction log.
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.
Via a live example, show how to backup a database, simulate a failure backup the tail of the database transaction log and perform the restore.

867 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

16 Experts available now in Live!

Get 1:1 Help Now