As they say in love and is true in SQL: you can sum some Data some of the time, but you can't always aggregate all Data all the time!
By the end of this Article it is my intention to bring the meaning and value of the above quote to those who chose to read this whether as a beginning SQL programmer or an intermediate one inexperienced with this little tidbit of SQL syntax: conditional aggregation.
Conditional aggregation is simply the use of aggregates under one or more conditions, thus, potentially altering your results of functions like SUM()
, etcetera in a particular column while allowing you to analyze your record set as a whole.
In other words, consider the following data:
expert |isMale |primaryZone
acperkins | 1 | MSSQL2K5
aneeshattingal | 1 | MSSQL2K5
angeliii | 1 | MSSQLSVR
Helen_Feddema | 0 | MSACCESS
matthewspatrick | 1 | SQLSYNTX
mbizup | 0 | MSACCESS
mwvisa1 | 1 | SQLSYNTX
ScottPletcher | 1 | MSSQLSVR
sdstuber | 1 | ORACLEDB
slightwv | 1 | ORACLEDB
To reproduce the above data, you can execute SQL similar to the below T-SQL example created for MS SQL Server.
(table structure -- create statement)
create table SQLExperts(
expert varchar(50) primary key,
(sample data -- insert statement)
insert into SQLExperts(expert, isMale, primaryZone)
select 'mbizup', 0, 'MSACCESS' union
select 'Helen_Feddema', 0, 'MSACCESS' union
select 'matthewspatrick', 1, 'SQLSYNTX' union
select 'mwvisa1', 1, 'SQLSYNTX' union
select 'angeliii', 1, 'MSSQLSVR' union
select 'ScottPletcher', 1, 'MSSQLSVR' union
select 'acperkins', 1, 'MSSQL2K5' union
select 'aneeshattingal', 1, 'MSSQL2K5' union
select 'sdstuber', 1, 'ORACLEDB' union
select 'slightwv', 1, 'ORACLEDB'
To start, if we wanted to know how many Experts are on the list we would simply use:
select count(*) as cnt from SQLExperts;
Now, what if we wanted to know how many female
Experts are on the list?
Then we would add a WHERE
conditional clause to the query:
select count(*) as cnt
where isMale = 0;
, but what is the percent of female Experts to the total?
2. Conditional Aggregates: Basics
Without conditional aggregates, to answer the previous question you would need a secondary query to get the total to do the percentage calculation.
This code look familiar?
select (select count(*) from SQLExperts where isMale = 0) * 100.0 / count(*) as femaleExperts
With conditional aggregates (which I have seen a number of questions on lately), we can get the count of female Experts in the same query as the total like so:
select count(case isMale when 0 then expert end) * 100.0 / count(*) as femaleExperts
(MS Access IIF version -- MySQL IF syntax would be similar)
select sum(iif(isMale=0, 1, 0)) * 100.0 / count(*) as femaleExperts
(MS Access SWITCH version) added 2010-08-24
select count(switch(isMale = 0, 1)) * 100.0 / count(*) as femaleExperts
(Oracle DECODE version) added 2010-08-24
select sum(decode(isMale, 0, 1)) * 100.0 / count(*) as femaleExperts
Notice you can get a count of specific rows by using the SUM() function while using a conditional statement to add a 1 for matches and a 0 otherwise making the sum the same as the number of rows meeting your criteria.
On some systems there is not a separate operator for Integer division (i.e., division operator acts differently based on data type of values involved); therefore, note that 100.0 is used purposefully to account for instances where this is not the case. For example, the above queries yield 20% as response. I used 100.0 so that result would be 20.0000, but could have made this 0.2000 by multiplying by 1.0. Note, however, what result you get in MS SQL if you simply omit this portion of the calculation or multiply by 1.
3. Conditional Aggregates: NULLIF
The basic principle on most conditional aggregates is that you include a specific column or literal when your criteria is met or you consider a value the same as null which is not considered in aggregate functions.
Consequently, this is why I omitted the else
condition on my case statement as values not matching the when condition will result in null anyway. In other words, these two bits of SQL are equivalent in results (i.e., 4):
select sum(case when primaryZone like 'MSSQL%' and isMale = 1 then 1 else 0 end) as cnt
select sum(case when primaryZone like 'MSSQL%' and isMale = 1 then 1 else null end) as cnt
select sum(case when primaryZone like 'MSSQL%' and isMale = 1 then 1 end) as cnt
Therefore, one neat trick you can use in simple cases is to take advantage of NULLIF
function if available on your SQL platform to generate nulls for values you don't want to include.
For example, this counts the 8 male
Experts or isMale values not equal to 0:
select count(nullif(isMale, 0)) as cnt
This counts the number of Experts whose primary zone is not 'MSACCESS', which also happens to be 8:
select count(nullif(primaryZone, 'MSACCESS')) as cnt
4. PIVOTing Without Fancy Keywords or TRANSFORMations
Another nice usage of conditional aggregates is pivoting data as seen in this example question regarding cross tab queries in SQL 2000
As seen in the linked question, we can pivot the data by the isMale column (male or female) like this:
, count(case isMale when 0 then expert end) as Female
, count(case isMale when 1 then expert end) as Male
group by primaryZone;
Which isn't as cool as this MS SQL 2005 T-SQL code maybe:
select primaryZone,  as Female,  as Male
pivot (count(expert) for isMale in (,)) pvt;
But it gets the job done in most SQL platforms, with minimal tweaking based on support for CASE syntax shown or existence of IF/IIF or other helpful control flow functions, yielding results:
Hopefully it is self explanatory how this same data could have been pivoted by primary zone instead of gender. If not, like with anything else discussed here, please feel free to write me a comment below.
So as you can see or will find out, it is not always practical to aggregate all rows in your query based on your business need; however, you can definitely analyze a portion of the data using conditionals within your aggregates in those instances where it makes sense.
This is a very novice concept, but, given the amount of questions regarding this or some complex queries that can be simplified to use this concept, I am hoping that this article was of value to you and thank you for reading.
Until the next adventure...
Best regards and happy coding,
Kevin C. Cross, Sr. (mwvisa1
If you found this article helpful
, please click the Yes
button after the question just below. This will give me (the author) a few points and might encourage me to write more articles.
If you didn't or otherwise feel the need to vote No, please first leave a comment to give me a chance to answer and perhaps to improve this article.