- Community Pick
- Experts Exchange Approved
- Editor's Choice
- 1
Introduction
By the end of this Article it is my intention to bring the meaning and value of the above quote to those who chose to read this whether as a beginning SQL programmer or an intermediate one inexperienced with this little tidbit of SQL syntax: conditional aggregation.
Conditional aggregation is simply the use of aggregates under one or more conditions, thus, potentially altering your results of functions like SUM(), COUNT(), etcetera in a particular column while allowing you to analyze your record set as a whole.
In other words, consider the following data:
- ""]««setup»»
To reproduce the above data, you can execute SQL similar to the below T-SQL example created for MS SQL Server.
(table structure -- create statement)
(sample data -- insert statement)
To start, if we wanted to know how many Experts are on the list we would simply use:
Now, what if we wanted to know how many female Experts are on the list?
Then we would add a WHERE conditional clause to the query:
This works, but what is the percent of female Experts to the total?
- 2
Conditional Aggregates: Basics
Without conditional aggregates, to answer the previous question you would need a secondary query to get the total to do the percentage calculation.
This code look familiar?
With conditional aggregates (which I have seen a number of questions on lately), we can get the count of female Experts in the same query as the total like so:
(MS Access IIF version -- MySQL IF syntax would be similar)
(MS Access SWITCH version) added 2010-08-24
(Oracle DECODE version) added 2010-08-24
Notice you can get a count of specific rows by using the SUM() function while using a conditional statement to add a 1 for matches and a 0 otherwise making the sum the same as the number of rows meeting your criteria.
- ""]««bonus tip»»
On some systems there is not a separate operator for Integer division (i.e., division operator acts differently based on data type of values involved); therefore, note that 100.0 is used purposefully to account for instances where this is not the case. For example, the above queries yield 20% as response. I used 100.0 so that result would be 20.0000, but could have made this 0.2000 by multiplying by 1.0. Note, however, what result you get in MS SQL if you simply omit this portion of the calculation or multiply by 1.
- 3
Conditional Aggregates: NULLIF
The basic principle on most conditional aggregates is that you include a specific column or literal when your criteria is met or you consider a value the same as null which is not considered in aggregate functions.
Consequently, this is why I omitted the else condition on my case statement as values not matching the when condition will result in null anyway. In other words, these two bits of SQL are equivalent in results (i.e., 4):
Therefore, one neat trick you can use in simple cases is to take advantage of NULLIF function if available on your SQL platform to generate nulls for values you don't want to include.
For example, this counts the 8 male Experts or isMale values not equal to 0:
This counts the number of Experts whose primary zone is not 'MSACCESS', which also happens to be 8:
- 4
PIVOTing Without Fancy Keywords or TRANSFORMations
Another nice usage of conditional aggregates is pivoting data as seen in this example question regarding cross tab queries in SQL 2000.
As seen in the linked question, we can pivot the data by the isMale column (male or female) like this:
Which isn't as cool as this MS SQL 2005 T-SQL code maybe:
But it gets the job done in most SQL platforms, with minimal tweaking based on support for CASE syntax shown or existence of IF/IIF or other helpful control flow functions, yielding results:
Hopefully it is self explanatory how this same data could have been pivoted by primary zone instead of gender. If not, like with anything else discussed here, please feel free to write me a comment below.
- 5
Conclusion
So as you can see or will find out, it is not always practical to aggregate all rows in your query based on your business need; however, you can definitely analyze a portion of the data using conditionals within your aggregates in those instances where it makes sense.
This is a very novice concept, but, given the amount of questions regarding this or some complex queries that can be simplified to use this concept, I am hoping that this article was of value to you and thank you for reading.
Until the next adventure...
Best regards and happy coding,
Kevin C. Cross, Sr. (mwvisa1)
=-=-=-=-=-=-=-=-=-=-=-=-=-
If you found this article helpful, please click the Yes button after the question just below. This will give me (the author) a few points and might encourage me to write more articles.
If you didn't or otherwise feel the need to vote No, please first leave a comment to give me a chance to answer and perhaps to improve this article.
Thank you!
=-=-=-=-=-=-=-=-=-=-=-=-=-
by: sdstuber on 2010-08-04 at 17:51:41ID: 17823
how cool, I'm sample data!!
nice article, on #2 "Oracle / MySQL IF syntax would be similar", In Oracle I think you would want CASE (as you have already illustrated) or DECODE to compare to, there is no SQL "IF".
I don't think MySQL supports an "IF" either