<

A SQL Tidbit: Conditional Aggregates

Published on
27,999 Points
8,499 Views
20 Endorsements
Last Modified:
Awarded
As they say in love and is true in SQL: you can sum some Data some of the time, but you can't always aggregate all Data all the time!

1. Introduction

By the end of this Article it is my intention to bring the meaning and value of the above quote to those who chose to read this whether as a beginning SQL programmer or an intermediate one inexperienced with this little tidbit of SQL syntax: conditional aggregation.

Conditional aggregation is simply the use of aggregates under one or more conditions, thus, potentially altering your results of functions like SUM(), COUNT(), etcetera in a particular column while allowing you to analyze your record set as a whole.

In other words, consider the following data:
----------------+-------+------------
expert          |isMale |primaryZone
----------------+-------+------------
acperkins       |   1   | MSSQL2K5
aneeshattingal  |   1   | MSSQL2K5
angeliii        |   1   | MSSQLSVR
Helen_Feddema   |   0   | MSACCESS
matthewspatrick |   1   | SQLSYNTX
mbizup          |   0   | MSACCESS
mwvisa1         |   1   | SQLSYNTX
ScottPletcher   |   1   | MSSQLSVR
sdstuber        |   1   | ORACLEDB
slightwv        |   1   | ORACLEDB
----------------+-------+------------

Open in new window


««setup»»
To reproduce the above data, you can execute SQL similar to the below T-SQL example created for MS SQL Server.
(table structure -- create statement)
create table SQLExperts(
   expert varchar(50) primary key, 
   isMale bit, 
   primaryZone char(8)
);

Open in new window


(sample data -- insert statement)
insert into SQLExperts(expert, isMale, primaryZone)
select 'mbizup', 0, 'MSACCESS' union 
select 'Helen_Feddema', 0, 'MSACCESS' union
select 'matthewspatrick', 1, 'SQLSYNTX' union
select 'mwvisa1', 1, 'SQLSYNTX' union
select 'angeliii', 1, 'MSSQLSVR' union
select 'ScottPletcher', 1, 'MSSQLSVR' union
select 'acperkins', 1, 'MSSQL2K5' union
select 'aneeshattingal', 1, 'MSSQL2K5' union
select 'sdstuber', 1, 'ORACLEDB' union
select 'slightwv', 1, 'ORACLEDB'
;

Open in new window


To start, if we wanted to know how many Experts are on the list we would simply use:
select count(*) as cnt from SQLExperts;

Open in new window


Now, what if we wanted to know how many female Experts are on the list?
Then we would add a WHERE conditional clause to the query:
select count(*) as cnt 
from SQLExperts
where isMale = 0;

Open in new window

This works, but what is the percent of female Experts to the total?

2. Conditional Aggregates: Basics

Without conditional aggregates, to answer the previous question you would need a secondary query to get the total to do the percentage calculation.

This code look familiar?
select (select count(*) from SQLExperts where isMale = 0) * 100.0 / count(*) as femaleExperts
from SQLExperts;

Open in new window


With conditional aggregates (which I have seen a number of questions on lately), we can get the count of female Experts in the same query as the total like so:
select count(case isMale when 0 then expert end) * 100.0 / count(*) as femaleExperts
from SQLExperts;

Open in new window


(MS Access IIF version -- MySQL IF syntax would be similar)
select sum(iif(isMale=0, 1, 0)) * 100.0 / count(*) as femaleExperts
from SQLExperts;

Open in new window


(MS Access SWITCH version) added 2010-08-24
select count(switch(isMale = 0, 1)) * 100.0 / count(*) as femaleExperts
from SQLExperts;

Open in new window


(Oracle DECODE version) added 2010-08-24
select sum(decode(isMale, 0, 1)) * 100.0 / count(*) as femaleExperts
from SQLExperts;

Open in new window


Notice you can get a count of specific rows by using the SUM() function while using a conditional statement to add a 1 for matches and a 0 otherwise making the sum the same as the number of rows meeting your criteria.

««bonus tip»»
On some systems there is not a separate operator for Integer division (i.e., division operator acts differently based on data type of values involved); therefore, note that 100.0 is used purposefully to account for instances where this is not the case. For example, the above queries yield 20% as response. I used 100.0 so that result would be 20.0000, but could have made this 0.2000 by multiplying by 1.0. Note, however, what result you get in MS SQL if you simply omit this portion of the calculation or multiply by 1.

3. Conditional Aggregates: NULLIF

The basic principle on most conditional aggregates is that you include a specific column or literal when your criteria is met or you consider a value the same as null which is not considered in aggregate functions.

Consequently, this is why I omitted the else condition on my case statement as values not matching the when condition will result in null anyway. In other words, these two bits of SQL are equivalent in results (i.e., 4):
select sum(case when primaryZone like 'MSSQL%' and isMale = 1 then 1 else 0 end) as cnt
from SQLExperts;

Open in new window

select sum(case when primaryZone like 'MSSQL%' and isMale = 1 then 1 else null end) as cnt
from SQLExperts;

Open in new window

select sum(case when primaryZone like 'MSSQL%' and isMale = 1 then 1 end) as cnt
from SQLExperts;

Open in new window


Therefore, one neat trick you can use in simple cases is to take advantage of NULLIF function if available on your SQL platform to generate nulls for values you don't want to include.

For example, this counts the 8 male Experts or isMale values not equal to 0:
select count(nullif(isMale, 0)) as cnt
from SQLExperts;

Open in new window


This counts the number of Experts whose primary zone is not 'MSACCESS', which also happens to be 8:
select count(nullif(primaryZone, 'MSACCESS')) as cnt
from SQLExperts;

Open in new window


4. PIVOTing Without Fancy Keywords or TRANSFORMations

Another nice usage of conditional aggregates is pivoting data as seen in this example question regarding cross tab queries in SQL 2000.

As seen in the linked question, we can pivot the data by the isMale column (male or female) like this:
select PrimaryZone
     , count(case isMale when 0 then expert end) as Female
     , count(case isMale when 1 then expert end) as Male
from SQLExperts
group by primaryZone;

Open in new window


Which isn't as cool as this MS SQL 2005 T-SQL code maybe:
select primaryZone, [0] as Female, [1] as Male
from SQLExperts
pivot (count(expert) for isMale in ([0],[1])) pvt;

Open in new window


But it gets the job done in most SQL platforms, with minimal tweaking based on support for CASE syntax shown or existence of IF/IIF or other helpful control flow functions, yielding results:
Pivoted Data Results
Hopefully it is self explanatory how this same data could have been pivoted by primary zone instead of gender. If not, like with anything else discussed here, please feel free to write me a comment below.

5. Conclusion

So as you can see or will find out, it is not always practical to aggregate all rows in your query based on your business need; however, you can definitely analyze a portion of the data using conditionals within your aggregates in those instances where it makes sense.

This is a very novice concept, but, given the amount of questions regarding this or some complex queries that can be simplified to use this concept, I am hoping that this article was of value to you and thank you for reading.

Until the next adventure...

Best regards and happy coding,

Kevin C. Cross, Sr. (mwvisa1)

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
If you found this article helpful, please click the Yes button after the question just below. This will give me (the author) a few points and might encourage me to write more articles.

If you didn't or otherwise feel the need to vote No, please first leave a comment to give me a chance to answer and perhaps to improve this article.

Thank you!
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
20
Comment
Author:Kevin Cross
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
17 Comments
 
LVL 74

Expert Comment

by:sdstuber
how cool,  I'm sample data!!

nice article,  on #2  "Oracle / MySQL IF syntax would be similar",   In Oracle I think you would want CASE (as you have already illustrated) or DECODE to compare to, there is no SQL "IF".  
I don't think MySQL supports an "IF" either
0
 
LVL 60

Author Comment

by:Kevin Cross
Thanks, Sean!
For some reason I thought Oracle had IFfunction like MySQL:
http://dev.mysql.com/doc/refman/5.0/en/control-flow-functions.html#function_if

I figured showing the CASE would be more universal, but was trying to show alternatives. I like DECODE and SWITCH (MS Access) also and consider showing those as examples, so glad you brought that up here for readers.

Thanks again for the confirmation on not being able to use IF inline with Oracle.

Regards,
Kevin
0
 
LVL 60

Author Comment

by:Kevin Cross
Thanks again, Sean. I updated article to reflect more correctly IF in MySQL would look similar to MS Access IIF.

For those who embrace the use of IF/THEN with no ELSE (utilizing aggregate treatment of NULLs), note that this also works while we are on the topic of IIF:
select count( iif(isMale = 0, 1) ) from SQLExperts;

Open in new window


Additionally, I added two more examples of DECODE and SWITCH since they are a bit different as syntax.
decode(expression, search, result [,search, result]...[,default] )

Open in new window

switch(expression1, value1, expression2, value2, ... expression_n, value_n )

Open in new window


At least for SWITCH() for the default case, you can use a final expression of the boolean value TRUE like so:
switch(isMale = 0, 1, true, 0)

Open in new window


Since in our cases NULL or no default case is fine, we may not need an alternative to this in DECODE; however, as a side note, Sean is the preferred method simply to combine this with COALESCE?
coalesce(decode(isMale, 0, 'Female'), 'Male')

Open in new window



Anyway, thanks all for reading and hopefully this is now more accurate and my apologies to those I may have had chasing around looking for an IF function in Oracle. *smile*

Regards,

Kevin
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 74

Expert Comment

by:sdstuber
no problem, and thank you for the correction on MySQL.  It's obviously not my strong suit (yet) I'll take all the pointers I can get there.


Another interesting conditional function in Oracle is LNNVL.  
It allows comparisons with NULL's (which DECODE also allows)
 Here's an example I use in my IOUG presentation

The last 2 columns returned are just for academic interest.  LNNVL returns a boolean which isn't normally allowed in Oracle SQL, but is here but is treated as a NULL if
returned directly instead of being nested within another condition.

SELECT a,
       b,
       CASE WHEN LNNVL(a = b) THEN 1 ELSE 0 END "lnnvl(a=b)",
       CASE WHEN LNNVL(a != b) THEN 1 ELSE 0 END "lnnvl(a!=b)",
       LNNVL(a = b),
       LNNVL(a != b)
  FROM (SELECT TO_NUMBER(NULL) a, TO_NUMBER(NULL) b FROM DUAL
        UNION ALL
        SELECT TO_NUMBER(NULL) a, 1 b FROM DUAL
        UNION ALL
        SELECT 1 a, TO_NUMBER(NULL) b FROM DUAL
        UNION ALL
        SELECT 1 a, 1 b FROM DUAL
        UNION ALL
        SELECT 1 a, 2 b FROM DUAL);
0
 
LVL 60

Author Comment

by:Kevin Cross
That is pretty cool and good to add to the toolkit. Always nice having conversations with folks like yourself. If I haven't mentioned it lately, I love Experts Exchange!
0
 
LVL 60

Author Comment

by:Kevin Cross
And never mind my question at http:#c17826 as I just remembered (*sigh* even noted it above) that decode has in its syntax a default result.

So for those following along, you can just do:
decode(isMale, 0, 'Female', 'Male')

Open in new window

or
select sum( decode(isMale, 0, 1, 0) ) from SQLExperts;

Open in new window

0
 
LVL 74

Expert Comment

by:sdstuber
oops, looks like we were simul-posting.

as for your last question on coalesce(decode(isMale, 0, 'Female'), 'Male')

I would simply include the "else" or "default" in the decode rather than forcing a NULL and then using COALESCE or NVL to convert it to Male


select isMale,coalesce(decode(isMale, 0, 'Female'), 'Male') will_work, decode(isMale,0,'Female','Male') preferred
from (select 0 isMale from dual union all select 1 isMale from dual union all select null isMale from dual)
0
 
LVL 74

Expert Comment

by:sdstuber
oops again.  yes, you found it...  :)
0
 
LVL 60

Author Comment

by:Kevin Cross
Thanks!

On an inverse or complimentary topic, since we have been talking about filtering data during aggregation, it is nice to note that if you want to include NULLs rows some of the tools mentioned by Sean can come in very handy.

Consider taking an average of sales over 5 years. If the NULL rows are year 1 and year 2, it seems almost intuitive that AVG() function answer for sales of 5, 10 and 15 is correct at 10.

select avg(sales)
from (
   select 1 as yr, cast(null as int) as sales
   union select 2, null
   union select 3, 5
   union select 4, 10
   union select 5, 15
) t

Open in new window


Consider now though how you would want to treat NULL rows in year 3 and year 4 with sales in year 1, 2 and 5. Some businesses may want to get a true 5 year average treating the nulls as 0.

That is nicely acheived using NVL, NZ, COALESCE or other such functions shorthanding the syntax of CASE WHEN some_value IS NULL THEN ... END.

See how this change alters the above average:
avg( coalesce(sales, 0) )

Open in new window

 

Kevin
0
 
LVL 51

Expert Comment

by:Mark Wills
and it is a yes from me too :)
0
 

Expert Comment

by:DineshKammath

select count(*) from SQLExperts where ismale=0
0
 
LVL 60

Author Comment

by:Kevin Cross
@DineshKammath:

Thanks for the comment, but I am failing to see its point? That query was presented in the article content above ... what did you mean to say here or did you mean to post elsewhere?

Kevin
0
 
LVL 60

Author Comment

by:Kevin Cross
A nice treatment of the Switch function recently published:
Using the Switch Function in Microsoft Access

Thought it might interest those here since we touched briefly on Switch.

Kevin
0
 
LVL 60

Author Comment

by:Kevin Cross

Addendum: Conditional Aggregate PIVOTing In Action

In a recent question, MattDuPlessis, presented a good scenario for demonstrating cross tabulation using conditional aggregation in the question Group and Transpose data.

The Request
I have data in a table in the following format: [{see table 1.1}]...I would like to group and transpose it as follows: [{see table 1.2}].
[step=""]Table 1.1: Original Data
ID        key            value
1         name           Peter
1         surname        Parker
1         identity       12345
2         name           Mark
2         surname        Manners
2         identity       54321

Open in new window


Table 1.2: Desired Results
ID         name         surname    identity
1          Peter        Parker     12345
2          Mark         Manners    54321

Open in new window

[/step]

The Conditional Aggregate Solution
select ID
     , max(case [key] when 'name' then value end) as [Name]
     , max(case [key] when 'surname' then value end) as [Surname]
     , max(case [key] when 'identity' then value end) as [Identity]
from your_table_name
group by id
;

Open in new window


I thought this was a good example to share as I focused in a lot on SUM() and COUNT() in the article as that is typically where we see conditionals during aggregation. MAX() and other aggregates make sense too at times, so just good to see others in action. In this scenario, we are pivoting data of probably the VarChar variety as it would need to be something along those lines to store both numeric and string data. Since you wouldn't typically SUM() string data, MAX() comes to mind but is especially appropriate since the expectation here is that there is only one instance of each key per ID; therefore, keep in mind this could easily also be shown with MIN().

Thanks again for reading!

Regards,
Kevin
0
 
LVL 60

Author Comment

by:Kevin Cross

Addendum: Be Fearful of Nothing


In section #3 (NULLIF), we discussed how the following were equivalent in their results.
select sum(case when primaryZone like 'MSSQL%' then 1 else 0 end)
from SQLExperts;

Open in new window

select sum(case when primaryZone like 'MSSQL%' then 1 else null end)
from SQLExperts;

Open in new window

select sum(case when primaryZone like 'MSSQL%' then 1 end)
from SQLExperts;

Open in new window


A point that came up recently, that is good also to note here, is that this is not always true.  In most cases when doing conditional aggregates, the basis of the approach is that you expect at least one row to meet your criteria or are using COUNT() to tally the number of rows that do.  COUNT() will return 0 if all row data counted is NULL; however, just be mindful that other aggregate functions, like SUM() or AVG(), may return NULL in the same scenario.

So to refine the above, those three statements have the same results as long as you have at least one row meeting the conditional in the aggregate.  Otherwise, you will get 0 for the first and NULL for the other two.

Subsequently, this subtle difference is typically fine and often desired when dealing with pivoting data, at least in my experience, as it is often helpful for us to see where we had no sales or other activity (i.e., NULL sum) versus a net 0 month (i.e., sales but returns or credits that offset those sales).  However, I saw a case recently where the conditional sum was used later in a math equation and NULL + 5 = NULL when using the standard arithmetic operators which is unlike the SUM() of two rows with data of 5 and NULL, respectively.


Hopefully that was clear and a useful addition to our tidbit on conditional aggregates.

Thanks again for reading and voting,

Kevin
0
 
LVL 61

Expert Comment

by:mbizup
Kevin,

Great article!  

A few suggestions:

1. In general, it is a good practice to avoid embedding VBA functions in your Access queries, and it usually can be done.  They slow things down (Domain Aggregate functions are the most notorious for this).

I haven't actually benchmarked  this query using iif (which I don't think would have nearly the same performance hit as a domain aggregate function):

select sum(iif(isMale=0, 1, 0)) * 100.0 / count(*)
from SQLExperts;

Open in new window



But it can be written equivalently without embedded VBA as:

SELECT Sum(-1 * Not IsMale)*100/Count(*) AS [Female Experts]
FROM SQLExperts;

Open in new window



2. "NULLIF function if available on your SQL platform to generate nulls ..."
Do you want to include an alternative for other platforms as a sidenote?

select sum(-1 * (primaryZone <> 'MSACCESS')) as cnt
from SQLExperts;

Open in new window



3.  Under the Pivot section, this is an option for systems (such as Access) that don't support CASE:
select PrimaryZone
     , Sum(-1 * Not [IsMale]) as Female
     , Sum(-1 * [IsMale]) as Male
from SQLExperts
group by primaryZone;
 

Open in new window

0
 
LVL 60

Author Comment

by:Kevin Cross
Thanks, Miriam, that is great feedback!
0

Featured Post

Free Tool: Subnet Calculator

The subnet calculator helps you design networks by taking an IP address and network mask and returning information such as network, broadcast address, and host range.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

Join & Write a Comment

In this brief tutorial Pawel from AdRem Software explains how you can quickly find out which services are running on your network, or what are the IP addresses of servers responsible for each service. Software used is freeware NetCrunch Tools (https…
Do you want to know how to make a graph with Microsoft Access? First, create a query with the data for the chart. Then make a blank form and add a chart control. This video also shows how to change what data is displayed on the graph as well as form…

Keep in touch with Experts Exchange

Tech news and trends delivered to your inbox every month