Solved

Ms SQL Compound Statement that queries the results of a UNION

Posted on 2009-05-08
15
618 Views
Last Modified: 2012-05-06
I am trying to write a compound state that queries the results of a UNION in MS SQL 2005.  I am trying the following structure but can't get it to work:

Select XXXX
(
select YYYYYY
UNION
select ZZZZZ
)

The result of the union looks like
name1 | 200
name2 | 100
name1 | 300
name2 | 150

I want to run a query on the result of the union so I end up with
name1 | 500
name2 | 250

select PrimaryAuthor, SUM(ND1) FROM
(select PrimaryAuthor, COUNT(PrimaryAuthor) as ND1 
FROM DataTableCopy
Where PrimaryAuthor is not null and PrimaryAuthor <> 'Null'
GROUP BY PrimaryAuthor 
union all
select SecondaryAuthor, COUNT(SecondaryAuthor) as ND2 
FROM DataTableCopy
Where PrimaryAuthor is not null and SecondaryAuthor <> 'Null'
GROUP BY SecondaryAuthor)

Open in new window

0
Comment
Question by:TedgCl
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 3
  • 2
  • +3
15 Comments
 
LVL 143

Accepted Solution

by:
Guy Hengel [angelIII / a3] earned 400 total points
ID: 24336298
you are close:
select PrimaryAuthor, SUM(ND1) FROM
(select PrimaryAuthor, COUNT(PrimaryAuthor) as ND1 
FROM DataTableCopy
Where PrimaryAuthor is not null and PrimaryAuthor <> 'Null'
GROUP BY PrimaryAuthor 
union all
select SecondaryAuthor, COUNT(SecondaryAuthor) as ND2 
FROM DataTableCopy
Where PrimaryAuthor is not null and SecondaryAuthor <> 'Null'
GROUP BY SecondaryAuthor
) sq
GROUP BY PrimaryAuthor

Open in new window

0
 
LVL 26

Assisted Solution

by:tigin44
tigin44 earned 100 total points
ID: 24343004
use UNION instead of UNOIN ALL

SELECT Author, SUM(ND) AS TotalCount
FROM
		(
		SELECT PrimaryAuthor AS Author, COUNT(PrimaryAuthor) AS ND
		FROM DataTableCopy
		WHERE PrimaryAuthor IS NOT NULL 
		  AND PrimaryAuthor <> 'Null'
		GROUP BY PrimaryAuthor 
		UNION
		SELECT SecondaryAuthor AS Author, COUNT(SecondaryAuthor) AS ND
		FROM DataTableCopy
		WHERE SecondaryAuthor IS NOT NULL 
		  AND SecondaryAuthor <> 'Null'
		GROUP BY SecondaryAuthor
		) A
GROUP BY Author

Open in new window

0
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 24343015
tigin44,
 
  can you explain in how far the UNION instead of UNION ALL will help?
0
Tutorials alone can't teach real engineering

So we built better training tools.

-Hands-on Labs
-Instructor Mentoring
-Scenario-Based Tests
-Dedicated Cloud Servers

All at your fingertips. What are you waiting for?

 
LVL 27

Expert Comment

by:Chris Luttrell
ID: 24343065
tigin44, et al,
"Union" instead of "Union All" is not what you want here.  Given the original example but with some possible values changed to illustrate the problem, say name2 had 100 entries as both the Primary Author and the Secondary Author then the sample logical union should look like this:
name1 | 200
name2 | 100
name1 | 300
name2 | 100
and the proper results of the query on the result of the union should end up with this
name1 | 500
name2 | 200
but the Union (without the All modifier) will only return the distinct values from both sets and will return this
name1 | 200
name2 | 100
name1 | 300
resulting in the query results being wrong and giving
name1 | 500
name2 | 100
So angelIII is of course correct in using Union All and now others can know why.
0
 
LVL 40

Expert Comment

by:mrjoltcola
ID: 25007675
Unrelatd, but I will also point out that:

>>Where PrimaryAuthor is not null and PrimaryAuthor <> 'Null'

can be rewritten as:

>>Where PrimaryAuthor <> 'Null'


Because the very presence of compare (<>) ignores null values for consideration.

While I understand the original intent was probably to say "Give me all values where PrimaryAuthor is not null, oh, oops we also have some values where the string 'Null' is, so lets filter those as well..." but in the end it is redundant.

Also I note that the accepted solution changed:

>>Where PrimaryAuthor is not null and SecondaryAuthor <> 'Null'

To:

>>WHERE SecondaryAuthor IS NOT NULL AND SecondaryAuthor <> 'Null'



Thought I assume that is correct, it is not clear as to the asker's intent. If this change is correct, then my above comment also applies to this line as well.

NULLs don't count in comparisons, likes, etc. as they are "non-existent"
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 25007824
Concur with CGLuttrell. We want to sum every instance therefore union all is the correct operation.

However, primary Author could also be secondary Author, and possibly a distinction is required.

The second query might have an incorrect check of primaryauthor is not null rather than secondaryauthor is not null, though, it is also reasonable to only count secondaryauthors if there is a primaryauthor.


Check out :


create table #datatablecopy (PrimaryAuthor varchar(10), SecondaryAuthor varchar(10))
insert #datatablecopy values ('p1','s1')
insert #datatablecopy values ('p2','p1')
insert #datatablecopy values ('p3','s2')
insert #datatablecopy values ('p4','s3')
insert #datatablecopy values ('p5','p3')
 
 
select Author, sum(nd) 
from (
SELECT PrimaryAuthor AS Author, COUNT(PrimaryAuthor) AS ND
		FROM #DataTableCopy
		WHERE PrimaryAuthor IS NOT NULL AND PrimaryAuthor <> 'Null'
		GROUP BY PrimaryAuthor 
		UNION ALL
		SELECT SecondaryAuthor AS Author, COUNT(SecondaryAuthor) AS ND
		FROM #DataTableCopy
		WHERE SecondaryAuthor IS NOT NULL AND SecondaryAuthor <> 'Null'
		GROUP BY SecondaryAuthor
) b group by author
 
-- p1 and p3 each score 2
 
 
select Author, sum(nd) 
from (
SELECT PrimaryAuthor AS Author, COUNT(PrimaryAuthor) AS ND
		FROM #DataTableCopy
		WHERE PrimaryAuthor IS NOT NULL AND PrimaryAuthor <> 'Null'
		GROUP BY PrimaryAuthor 
		UNION
		SELECT SecondaryAuthor AS Author, COUNT(SecondaryAuthor) AS ND
		FROM #DataTableCopy
		WHERE SecondaryAuthor IS NOT NULL AND SecondaryAuthor <> 'Null'
		GROUP BY SecondaryAuthor
) b group by author
 
-- p1 and p3 each score 1

Open in new window

0
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 25007893
I admit, I did not see the "secondaryAuthor" vs "primaryAuthor", though I do presume it was a typo.
if it was not, fine with me.

I still have to argue over the "UNION", as it can produce wrong results.

show this test script:
create table DataTableCopy (PrimaryAuthor varchar(10), SecondaryAuthor varchar(10))
go
set nocount on
insert into DataTableCopy values ( 'data1', 'data2' )
insert into DataTableCopy values( 'data1', 'data3' )
insert into DataTableCopy values( 'data2', 'data3')
insert into DataTableCopy values( 'data2', 'data1')
insert into DataTableCopy values( 'data3', 'data1')
 
go
 
select PrimaryAuthor, COUNT(PrimaryAuthor) as ND1 
FROM DataTableCopy
Where PrimaryAuthor is not null and PrimaryAuthor <> 'Null'
GROUP BY PrimaryAuthor 
 
select SecondaryAuthor, COUNT(SecondaryAuthor) as ND1 
FROM DataTableCopy
Where SecondaryAuthor is not null and SecondaryAuthor <> 'Null'
GROUP BY SecondaryAuthor 
 
select PrimaryAuthor, SUM(ND1) FROM
(select PrimaryAuthor, COUNT(PrimaryAuthor) as ND1 
FROM DataTableCopy
Where PrimaryAuthor is not null and PrimaryAuthor <> 'Null'
GROUP BY PrimaryAuthor 
union 
select SecondaryAuthor, COUNT(SecondaryAuthor) as ND2 
FROM DataTableCopy
Where SecondaryAuthor is not null and SecondaryAuthor <> 'Null'
GROUP BY SecondaryAuthor
) sq
GROUP BY PrimaryAuthor
 
 
 
select PrimaryAuthor, SUM(ND1) FROM
(select PrimaryAuthor, COUNT(PrimaryAuthor) as ND1 
FROM DataTableCopy
Where PrimaryAuthor is not null and PrimaryAuthor <> 'Null'
GROUP BY PrimaryAuthor 
union all
select SecondaryAuthor, COUNT(SecondaryAuthor) as ND2 
FROM DataTableCopy
Where SecondaryAuthor is not null and SecondaryAuthor <> 'Null'
GROUP BY SecondaryAuthor
) sq
GROUP BY PrimaryAuthor
 
 
 
 
 
go
drop table DataTableCopy

Open in new window

0
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 25007898
results:
PrimaryAuthor ND1
------------- -----------
data1         2
data2         2
data3         1
 
SecondaryAuthor ND1
--------------- -----------
data1           2
data2           1
data3           2
 
PrimaryAuthor 
------------- -----------
data1         2    <<<<< ooups <<< should be 4. UNION ate up 1 row from the subquery
data2         3 
data3         3
 
PrimaryAuthor 
------------- -----------
data1         4
data2         3
data3         3

Open in new window

0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 25008127
and if we add the extra entry of a primaryauthor being NULL as per:

insert #datatablecopy values (null,'p3')
insert #datatablecopy values (null,'p6')

and use the original primaryauthor is not null in the secondaryauthor select per angelIII's post versus change to secondaryauthor in tigin44's post, then we see even greater variations.


create table #datatablecopy (PrimaryAuthor varchar(10), SecondaryAuthor varchar(10))
insert #datatablecopy values ('p1','s1')
insert #datatablecopy values ('p2','p1')
insert #datatablecopy values ('p3','s2')
insert #datatablecopy values ('p4','s3')
insert #datatablecopy values ('p5','p3')
insert #datatablecopy values (null,'p3')
insert #datatablecopy values (null,'p6')
 
-- angelIII's post
 
select Author, sum(nd) 
from (
		SELECT PrimaryAuthor AS Author, COUNT(PrimaryAuthor) AS ND
		FROM #DataTableCopy
		WHERE PrimaryAuthor <> 'Null'
		GROUP BY PrimaryAuthor 
		UNION ALL
		SELECT SecondaryAuthor AS Author, COUNT(SecondaryAuthor) AS ND
		FROM #DataTableCopy
		WHERE PrimaryAuthor IS NOT NULL AND SecondaryAuthor <> 'Null'
		GROUP BY SecondaryAuthor
) b group by author
 
-- tigin44's
 
select Author, sum(nd) 
from (
		SELECT PrimaryAuthor AS Author, COUNT(PrimaryAuthor) AS ND
		FROM #DataTableCopy
		WHERE PrimaryAuthor IS NOT NULL AND PrimaryAuthor <> 'Null'
		GROUP BY PrimaryAuthor 
		UNION
		SELECT SecondaryAuthor AS Author, COUNT(SecondaryAuthor) AS ND
		FROM #DataTableCopy
		WHERE SecondaryAuthor IS NOT NULL AND SecondaryAuthor <> 'Null'
		GROUP BY SecondaryAuthor
) b group by author

Open in new window

0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 25008403
Didn't notice, but can now see CGLuttrell was incomplete in the second example... Should also have excluded the second instance of name1|300, but the thought was there :)

and really, why not simply do :

select Author, Count(*)
from (
      SELECT PrimaryAuthor AS Author FROM DataTableCopy WHERE PrimaryAuthor is not Null
      UNION ALL
      SELECT SecondaryAuthor FROM DataTableCopy WHERE SecondaryAuthor is not Null
) b group by author


It is quicker...
0
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 25011250
given the field name "issue", I consider a split between the 2 first posts fair, as both gave necessary input (although neither gave an explanation ...)
0
 

Author Comment

by:TedgCl
ID: 25029955
AngelIII I believe you are correct.  This has turned out to be a very good thread.  How do I reassign the points?
0
 

Author Closing Comment

by:TedgCl
ID: 31579453
Thanks!
0

Featured Post

Instantly Create Instructional Tutorials

Contextual Guidance at the moment of need helps your employees adopt to new software or processes instantly. Boost knowledge retention and employee engagement step-by-step with one easy solution.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

International Data Corporation (IDC) prognosticates that before the current the year gets over disbursing on IT framework products to be sent in cloud environs will be $37.1B.
Ever wondered why sometimes your SQL Server is slow or unresponsive with connections spiking up but by the time you go in, all is well? The following article will show you how to install and configure a SQL job that will send you email alerts includ…
Using examples as well as descriptions, and references to Books Online, show the documentation available for date manipulation functions and by using a select few of these functions, show how date based data can be manipulated with these functions.
Via a live example, show how to backup a database, simulate a failure backup the tail of the database transaction log and perform the restore.

728 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question