A SQL Tidbit: Conditional Aggregates

AID: 3527
  • Status: Published

20122 points

  • By
  • TypeTips/Tricks
  • Posted on2010-08-04 at 11:10:12
Awards
  • Community Pick
  • Experts Exchange Approved
  • Editor's Choice

As they say in love and is true in SQL: you can sum some Data some of the time, but you can't always aggregate all Data all the time!

1

Introduction


By the end of this Article it is my intention to bring the meaning and value of the above quote to those who chose to read this whether as a beginning SQL programmer or an intermediate one inexperienced with this little tidbit of SQL syntax: conditional aggregation.

Conditional aggregation is simply the use of aggregates under one or more conditions, thus, potentially altering your results of functions like SUM(), COUNT(), etcetera in a particular column while allowing you to analyze your record set as a whole.

In other words, consider the following data:

----------------+-------+------------
expert          |isMale |primaryZone
----------------+-------+------------
acperkins       |   1   | MSSQL2K5
aneeshattingal  |   1   | MSSQL2K5
angeliii        |   1   | MSSQLSVR
Helen_Feddema   |   0   | MSACCESS
matthewspatrick |   1   | SQLSYNTX
mbizup          |   0   | MSACCESS
mwvisa1         |   1   | SQLSYNTX
ScottPletcher   |   1   | MSSQLSVR
sdstuber        |   1   | ORACLEDB
slightwv        |   1   | ORACLEDB
----------------+-------+------------
                                  
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:

Select allOpen in new window



""]««setup»»
To reproduce the above data, you can execute SQL similar to the below T-SQL example created for MS SQL Server.

(table structure -- create statement)

create table SQLExperts(
   expert varchar(50) primary key, 
   isMale bit, 
   primaryZone char(8)
);
                                  
1:
2:
3:
4:
5:

Select allOpen in new window



(sample data -- insert statement)

insert into SQLExperts(expert, isMale, primaryZone)
select 'mbizup', 0, 'MSACCESS' union 
select 'Helen_Feddema', 0, 'MSACCESS' union
select 'matthewspatrick', 1, 'SQLSYNTX' union
select 'mwvisa1', 1, 'SQLSYNTX' union
select 'angeliii', 1, 'MSSQLSVR' union
select 'ScottPletcher', 1, 'MSSQLSVR' union
select 'acperkins', 1, 'MSSQL2K5' union
select 'aneeshattingal', 1, 'MSSQL2K5' union
select 'sdstuber', 1, 'ORACLEDB' union
select 'slightwv', 1, 'ORACLEDB'
;
                                  
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:

Select allOpen in new window



To start, if we wanted to know how many Experts are on the list we would simply use:

select count(*) as cnt from SQLExperts;
                                  
1:

Select allOpen in new window



Now, what if we wanted to know how many female Experts are on the list?
Then we would add a WHERE conditional clause to the query:

select count(*) as cnt 
from SQLExperts
where isMale = 0;
                                  
1:
2:
3:

Select allOpen in new window


This works, but what is the percent of female Experts to the total?

2

Conditional Aggregates: Basics


Without conditional aggregates, to answer the previous question you would need a secondary query to get the total to do the percentage calculation.

This code look familiar?

select (select count(*) from SQLExperts where isMale = 0) * 100.0 / count(*) as femaleExperts
from SQLExperts;
                                  
1:
2:

Select allOpen in new window



With conditional aggregates (which I have seen a number of questions on lately), we can get the count of female Experts in the same query as the total like so:

select count(case isMale when 0 then expert end) * 100.0 / count(*) as femaleExperts
from SQLExperts;
                                  
1:
2:

Select allOpen in new window



(MS Access IIF version -- MySQL IF syntax would be similar)

select sum(iif(isMale=0, 1, 0)) * 100.0 / count(*) as femaleExperts
from SQLExperts;
                                  
1:
2:

Select allOpen in new window



(MS Access SWITCH version) added 2010-08-24

select count(switch(isMale = 0, 1)) * 100.0 / count(*) as femaleExperts
from SQLExperts;
                                  
1:
2:

Select allOpen in new window



(Oracle DECODE version) added 2010-08-24

select sum(decode(isMale, 0, 1)) * 100.0 / count(*) as femaleExperts
from SQLExperts;
                                  
1:
2:

Select allOpen in new window



Notice you can get a count of specific rows by using the SUM() function while using a conditional statement to add a 1 for matches and a 0 otherwise making the sum the same as the number of rows meeting your criteria.

""]««bonus tip»»
On some systems there is not a separate operator for Integer division (i.e., division operator acts differently based on data type of values involved); therefore, note that 100.0 is used purposefully to account for instances where this is not the case. For example, the above queries yield 20% as response. I used 100.0 so that result would be 20.0000, but could have made this 0.2000 by multiplying by 1.0. Note, however, what result you get in MS SQL if you simply omit this portion of the calculation or multiply by 1.



3

Conditional Aggregates: NULLIF


The basic principle on most conditional aggregates is that you include a specific column or literal when your criteria is met or you consider a value the same as null which is not considered in aggregate functions.

Consequently, this is why I omitted the else condition on my case statement as values not matching the when condition will result in null anyway. In other words, these two bits of SQL are equivalent in results (i.e., 4):

select sum(case when primaryZone like 'MSSQL%' and isMale = 1 then 1 else 0 end) as cnt
from SQLExperts;
                                  
1:
2:

Select allOpen in new window

select sum(case when primaryZone like 'MSSQL%' and isMale = 1 then 1 else null end) as cnt
from SQLExperts;
                                  
1:
2:

Select allOpen in new window

select sum(case when primaryZone like 'MSSQL%' and isMale = 1 then 1 end) as cnt
from SQLExperts;
                                  
1:
2:

Select allOpen in new window



Therefore, one neat trick you can use in simple cases is to take advantage of NULLIF function if available on your SQL platform to generate nulls for values you don't want to include.

For example, this counts the 8 male Experts or isMale values not equal to 0:

select count(nullif(isMale, 0)) as cnt
from SQLExperts;
                                  
1:
2:

Select allOpen in new window



This counts the number of Experts whose primary zone is not 'MSACCESS', which also happens to be 8:

select count(nullif(primaryZone, 'MSACCESS')) as cnt
from SQLExperts;
                                  
1:
2:

Select allOpen in new window



4

PIVOTing Without Fancy Keywords or TRANSFORMations


Another nice usage of conditional aggregates is pivoting data as seen in this example question regarding cross tab queries in SQL 2000.

As seen in the linked question, we can pivot the data by the isMale column (male or female) like this:

select PrimaryZone
     , count(case isMale when 0 then expert end) as Female
     , count(case isMale when 1 then expert end) as Male
from SQLExperts
group by primaryZone;
                                  
1:
2:
3:
4:
5:

Select allOpen in new window



Which isn't as cool as this MS SQL 2005 T-SQL code maybe:

select primaryZone, [0] as Female, [1] as Male
from SQLExperts
pivot (count(expert) for isMale in ([0],[1])) pvt;
                                  
1:
2:
3:

Select allOpen in new window



But it gets the job done in most SQL platforms, with minimal tweaking based on support for CASE syntax shown or existence of IF/IIF or other helpful control flow functions, yielding results:

CondAgg-04-Pivot.PNG
  • 5 KB
  • Pivoted Data Results
Pivoted Data Results



Hopefully it is self explanatory how this same data could have been pivoted by primary zone instead of gender. If not, like with anything else discussed here, please feel free to write me a comment below.

5

Conclusion


So as you can see or will find out, it is not always practical to aggregate all rows in your query based on your business need; however, you can definitely analyze a portion of the data using conditionals within your aggregates in those instances where it makes sense.

This is a very novice concept, but, given the amount of questions regarding this or some complex queries that can be simplified to use this concept, I am hoping that this article was of value to you and thank you for reading.

Until the next adventure...

Best regards and happy coding,

Kevin C. Cross, Sr. (mwvisa1)

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
If you found this article helpful, please click the Yes button after the question just below. This will give me (the author) a few points and might encourage me to write more articles.

If you didn't or otherwise feel the need to vote No, please first leave a comment to give me a chance to answer and perhaps to improve this article.

Thank you!
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 

    Asked On
    2010-08-04 at 11:10:12ID3527
    Tags

    SQL

    ,

    Conditional Aggregates

    ,

    MS SQL

    ,

    MS Access

    ,

    Oracle

    ,

    MySQL

    ,

    Database

    ,

    SUM

    ,

    COUNT

    ,

    CASE

    ,

    IF

    ,

    IIF

    Topic

    SQL Query Syntax

    Views
    3378

    Comments

    Expert Comment

    by: sdstuber on 2010-08-04 at 17:51:41ID: 17823

    how cool,  I'm sample data!!

    nice article,  on #2  "Oracle / MySQL IF syntax would be similar",   In Oracle I think you would want CASE (as you have already illustrated) or DECODE to compare to, there is no SQL "IF".  
    I don't think MySQL supports an "IF" either

    Author Comment

    by: mwvisa1 on 2010-08-04 at 17:56:19ID: 17824

    Thanks, Sean!
    For some reason I thought Oracle had IFfunction like MySQL:
    http://dev.mysql.com/doc/refman/5.0/en/control-flow-functions.html#function_if

    I figured showing the CASE would be more universal, but was trying to show alternatives. I like DECODE and SWITCH (MS Access) also and consider showing those as examples, so glad you brought that up here for readers.

    Thanks again for the confirmation on not being able to use IF inline with Oracle.

    Regards,
    Kevin

    Author Comment

    by: mwvisa1 on 2010-08-04 at 18:37:10ID: 17826

    Thanks again, Sean. I updated article to reflect more correctly IF in MySQL would look similar to MS Access IIF.

    For those who embrace the use of IF/THEN with no ELSE (utilizing aggregate treatment of NULLs), note that this also works while we are on the topic of IIF:

    select count( iif(isMale = 0, 1) ) from SQLExperts;
                                          
    1:
    

    Select allOpen in new window



    Additionally, I added two more examples of DECODE and SWITCH since they are a bit different as syntax.

    decode(expression, search, result [,search, result]...[,default] )
                                          
    1:
    

    Select allOpen in new window


    switch(expression1, value1, expression2, value2, ... expression_n, value_n )
                                          
    1:
    

    Select allOpen in new window



    At least for SWITCH() for the default case, you can use a final expression of the boolean value TRUE like so:

    switch(isMale = 0, 1, true, 0)
                                          
    1:
    

    Select allOpen in new window



    Since in our cases NULL or no default case is fine, we may not need an alternative to this in DECODE; however, as a side note, Sean is the preferred method simply to combine this with COALESCE?

    coalesce(decode(isMale, 0, 'Female'), 'Male')
                                          
    1:
    

    Select allOpen in new window



    Anyway, thanks all for reading and hopefully this is now more accurate and my apologies to those I may have had chasing around looking for an IF function in Oracle. *smile*

    Regards,

    Kevin

    Expert Comment

    by: sdstuber on 2010-08-04 at 18:37:22ID: 17827

    no problem, and thank you for the correction on MySQL.  It's obviously not my strong suit (yet) I'll take all the pointers I can get there.


    Another interesting conditional function in Oracle is LNNVL.  
    It allows comparisons with NULL's (which DECODE also allows)
     Here's an example I use in my IOUG presentation

    The last 2 columns returned are just for academic interest.  LNNVL returns a boolean which isn't normally allowed in Oracle SQL, but is here but is treated as a NULL if
    returned directly instead of being nested within another condition.

    SELECT a,
           b,
           CASE WHEN LNNVL(a = b) THEN 1 ELSE 0 END "lnnvl(a=b)",
           CASE WHEN LNNVL(a != b) THEN 1 ELSE 0 END "lnnvl(a!=b)",
           LNNVL(a = b),
           LNNVL(a != b)
      FROM (SELECT TO_NUMBER(NULL) a, TO_NUMBER(NULL) b FROM DUAL
            UNION ALL
            SELECT TO_NUMBER(NULL) a, 1 b FROM DUAL
            UNION ALL
            SELECT 1 a, TO_NUMBER(NULL) b FROM DUAL
            UNION ALL
            SELECT 1 a, 1 b FROM DUAL
            UNION ALL
            SELECT 1 a, 2 b FROM DUAL);

    Author Comment

    by: mwvisa1 on 2010-08-04 at 18:42:11ID: 17829

    That is pretty cool and good to add to the toolkit. Always nice having conversations with folks like yourself. If I haven't mentioned it lately, I love Experts Exchange!

    Author Comment

    by: mwvisa1 on 2010-08-04 at 18:44:49ID: 17830

    And never mind my question at http:#c17826 as I just remembered (*sigh* even noted it above) that decode has in its syntax a default result.

    So for those following along, you can just do:

    decode(isMale, 0, 'Female', 'Male')
                                          
    1:
    

    Select allOpen in new window


    or

    select sum( decode(isMale, 0, 1, 0) ) from SQLExperts;
                                          
    1:
    

    Select allOpen in new window

    Expert Comment

    by: sdstuber on 2010-08-04 at 18:49:53ID: 17831

    oops, looks like we were simul-posting.

    as for your last question on coalesce(decode(isMale, 0, 'Female'), 'Male')

    I would simply include the "else" or "default" in the decode rather than forcing a NULL and then using COALESCE or NVL to convert it to Male


    select isMale,coalesce(decode(isMale, 0, 'Female'), 'Male') will_work, decode(isMale,0,'Female','Male') preferred
    from (select 0 isMale from dual union all select 1 isMale from dual union all select null isMale from dual)

    Expert Comment

    by: sdstuber on 2010-08-04 at 18:50:18ID: 17832

    oops again.  yes, you found it...  :)

    Author Comment

    by: mwvisa1 on 2010-08-04 at 19:07:54ID: 17833

    Thanks!

    On an inverse or complimentary topic, since we have been talking about filtering data during aggregation, it is nice to note that if you want to include NULLs rows some of the tools mentioned by Sean can come in very handy.

    Consider taking an average of sales over 5 years. If the NULL rows are year 1 and year 2, it seems almost intuitive that AVG() function answer for sales of 5, 10 and 15 is correct at 10.

    select avg(sales)
    from (
       select 1 as yr, cast(null as int) as sales
       union select 2, null
       union select 3, 5
       union select 4, 10
       union select 5, 15
    ) t
                                          
    1:
    2:
    3:
    4:
    5:
    6:
    7:
    8:
    

    Select allOpen in new window



    Consider now though how you would want to treat NULL rows in year 3 and year 4 with sales in year 1, 2 and 5. Some businesses may want to get a true 5 year average treating the nulls as 0.

    That is nicely acheived using NVL, NZ, COALESCE or other such functions shorthanding the syntax of CASE WHEN some_value IS NULL THEN ... END.

    See how this change alters the above average:

    avg( coalesce(sales, 0) )
                                          
    1:
    

    Select allOpen in new window

     

    Kevin

    Expert Comment

    by: mark_wills on 2010-08-05 at 00:04:08ID: 17851

    and it is a yes from me too :)

    Expert Comment

    by: DineshKammath on 2010-08-09 at 22:58:44ID: 17973


    select count(*) from SQLExperts where ismale=0

    Author Comment

    by: mwvisa1 on 2010-08-10 at 06:17:50ID: 17982

    @DineshKammath:

    Thanks for the comment, but I am failing to see its point? That query was presented in the article content above ... what did you mean to say here or did you mean to post elsewhere?

    Kevin

    Author Comment

    by: mwvisa1 on 2010-08-11 at 16:45:45ID: 18058

    A nice treatment of the Switch function recently published:
    Using the Switch Function in Microsoft Access

    Thought it might interest those here since we touched briefly on Switch.

    Kevin

    Author Comment

    by: mwvisa1 on 2010-08-16 at 20:35:12ID: 18264

    Addendum: Conditional Aggregate PIVOTing In Action


    In a recent question, MattDuPlessis, presented a good scenario for demonstrating cross tabulation using conditional aggregation in the question Group and Transpose data.

    The Request
    I have data in a table in the following format: [{see table 1.1}]...I would like to group and transpose it as follows: [{see table 1.2}].

    ""]Table 1.1: Original Data

    ID        key            value
    1         name           Peter
    1         surname        Parker
    1         identity       12345
    2         name           Mark
    2         surname        Manners
    2         identity       54321
                                          
    1:
    2:
    3:
    4:
    5:
    6:
    7:
    

    Select allOpen in new window



    Table 1.2: Desired Results

    ID         name         surname    identity
    1          Peter        Parker     12345
    2          Mark         Manners    54321
                                          
    1:
    2:
    3:
    

    Select allOpen in new window



    The Conditional Aggregate Solution

    select ID
         , max(case [key] when 'name' then value end) as [Name]
         , max(case [key] when 'surname' then value end) as [Surname]
         , max(case [key] when 'identity' then value end) as [Identity]
    from your_table_name
    group by id
    ;
                                          
    1:
    2:
    3:
    4:
    5:
    6:
    7:
    

    Select allOpen in new window



    I thought this was a good example to share as I focused in a lot on SUM() and COUNT() in the article as that is typically where we see conditionals during aggregation. MAX() and other aggregates make sense too at times, so just good to see others in action. In this scenario, we are pivoting data of probably the VarChar variety as it would need to be something along those lines to store both numeric and string data. Since you wouldn't typically SUM() string data, MAX() comes to mind but is especially appropriate since the expectation here is that there is only one instance of each key per ID; therefore, keep in mind this could easily also be shown with MIN().

    Thanks again for reading!

    Regards,
    Kevin

    Author Comment

    by: mwvisa1 on 2010-09-20 at 11:41:18ID: 19641

    Addendum: Be Fearful of Nothing



    In section #3 (NULLIF), we discussed how the following were equivalent in their results.

    select sum(case when primaryZone like 'MSSQL%' then 1 else 0 end)
    from SQLExperts;
                                          
    1:
    2:
    

    Select allOpen in new window

    select sum(case when primaryZone like 'MSSQL%' then 1 else null end)
    from SQLExperts;
                                          
    1:
    2:
    

    Select allOpen in new window

    select sum(case when primaryZone like 'MSSQL%' then 1 end)
    from SQLExperts;
                                          
    1:
    2:
    

    Select allOpen in new window



    A point that came up recently, that is good also to note here, is that this is not always true.  In most cases when doing conditional aggregates, the basis of the approach is that you expect at least one row to meet your criteria or are using COUNT() to tally the number of rows that do.  COUNT() will return 0 if all row data counted is NULL; however, just be mindful that other aggregate functions, like SUM() or AVG(), may return NULL in the same scenario.

    So to refine the above, those three statements have the same results as long as you have at least one row meeting the conditional in the aggregate.  Otherwise, you will get 0 for the first and NULL for the other two.

    Subsequently, this subtle difference is typically fine and often desired when dealing with pivoting data, at least in my experience, as it is often helpful for us to see where we had no sales or other activity (i.e., NULL sum) versus a net 0 month (i.e., sales but returns or credits that offset those sales).  However, I saw a case recently where the conditional sum was used later in a math equation and NULL + 5 = NULL when using the standard arithmetic operators which is unlike the SUM() of two rows with data of 5 and NULL, respectively.


    Hopefully that was clear and a useful addition to our tidbit on conditional aggregates.

    Thanks again for reading and voting,

    Kevin

    Expert Comment

    by: mbizup on 2010-12-08 at 05:42:18ID: 21948

    Kevin,

    Great article!  

    A few suggestions:

    1. In general, it is a good practice to avoid embedding VBA functions in your Access queries, and it usually can be done.  They slow things down (Domain Aggregate functions are the most notorious for this).

    I haven't actually benchmarked  this query using iif (which I don't think would have nearly the same performance hit as a domain aggregate function):

    select sum(iif(isMale=0, 1, 0)) * 100.0 / count(*)
    from SQLExperts;
                                          
    1:
    2:
    

    Select allOpen in new window



    But it can be written equivalently without embedded VBA as:

    SELECT Sum(-1 * Not IsMale)*100/Count(*) AS [Female Experts]
    FROM SQLExperts;
                                          
    1:
    2:
    

    Select allOpen in new window



    2. "NULLIF function if available on your SQL platform to generate nulls ..."
    Do you want to include an alternative for other platforms as a sidenote?

    select sum(-1 * (primaryZone <> 'MSACCESS')) as cnt
    from SQLExperts;
                                          
    1:
    2:
    

    Select allOpen in new window



    3.  Under the Pivot section, this is an option for systems (such as Access) that don't support CASE:

    select PrimaryZone
         , Sum(-1 * Not [IsMale]) as Female
         , Sum(-1 * [IsMale]) as Male
    from SQLExperts
    group by primaryZone;
     
                                          
    1:
    2:
    3:
    4:
    5:
    6:
    

    Select allOpen in new window

    Author Comment

    by: mwvisa1 on 2010-12-14 at 07:52:56ID: 22062

    Thanks, Miriam, that is great feedback!

    Add your Comment

    Please Sign up or Log in to comment on this article.

    Loading Advertisement...

    Top Query Syntax Experts

    1. sdstuber

      57,920

      Master

      0 points yesterday

      Profile
      Rank: Genius
    2. mwvisa1

      43,736

      20 points yesterday

      Profile
      Rank: Genius
    3. capricorn1

      36,200

      0 points yesterday

      Profile
      Rank: Savant
    4. matthewspatrick

      35,100

      0 points yesterday

      Profile
      Rank: Savant
    5. mbizup

      25,668

      2,000 points yesterday

      Profile
      Rank: Genius
    6. awking00

      18,956

      0 points yesterday

      Profile
      Rank: Genius
    7. slightwv

      17,200

      0 points yesterday

      Profile
      Rank: Genius
    8. ralmada

      14,300

      0 points yesterday

      Profile
      Rank: Genius
    9. Cluskitt

      13,400

      0 points yesterday

      Profile
      Rank: Wizard
    10. Ray_Paseur

      13,300

      0 points yesterday

      Profile
      Rank: Savant
    11. fyed

      13,268

      0 points yesterday

      Profile
      Rank: Genius
    12. pratima_mcs

      12,500

      0 points yesterday

      Profile
      Rank: Genius
    13. jogos

      12,332

      0 points yesterday

      Profile
      Rank: Sage
    14. angelIII

      11,700

      1,000 points yesterday

      Profile
      Rank: Elite
    15. knightEknight

      11,552

      0 points yesterday

      Profile
      Rank: Genius
    16. acperkins

      11,320

      0 points yesterday

      Profile
      Rank: Genius
    17. Kdo

      10,268

      0 points yesterday

      Profile
      Rank: Genius
    18. lludden

      10,200

      0 points yesterday

      Profile
      Rank: Wizard
    19. dqmq

      10,000

      0 points yesterday

      Profile
      Rank: Genius
    20. anujnb

      9,634

      1,000 points yesterday

      Profile
      Rank: Guru
    21. LSMConsulting

      9,100

      0 points yesterday

      Profile
      Rank: Savant
    22. TempDBA

      9,052

      400 points yesterday

      Profile
      Rank: Sage
    23. dtodd

      8,868

      0 points yesterday

      Profile
      Rank: Genius
    24. Lowfatspread

      8,336

      0 points yesterday

      Profile
      Rank: Genius
    25. peter57r

      8,200

      0 points yesterday

      Profile
      Rank: Savant

    Hall Of Fame