Calculating Distinct Counts in Access

AID: 2417
  • Status: Published

20858 points

Awards
  • Community Pick
  • Experts Exchange Approved
  • Editor's Choice
by Patrick G. Matthews

1

Introduction



All major SQL dialects allow for a Count aggregate function, and Access's native Jet SQL is no exception.  As in other database platforms, Access's Count() (as well as the "domain aggregate" DCount()) function returns the number of records pertaining to a grouped result set and meeting the criteria specified in the query (in the case of Count()) or in DCount() in the third argument of the function.

Consider these example data:

Category  Num
-------------
x         1
x         1
x         2
x         3
y         1
y         1
z         1
z         1
z         2
                                    
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:

Select allOpen in new window



In this case, it is very simple to get a count of items for each Category:

SELECT Category, Count(*) AS Items
FROM SomeTable
GROUP BY Category

returns:
Category  Num
-------------
x         4
y         2
z         3
                                    
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:

Select allOpen in new window



Suppose, however, that instead of getting the count of each item for a given category, you instead wanted to know the count of distinct items for each category.  SQL Server's Transact SQL and some other SQL dialects support the Count(Distinct ColumnName) syntax:

SELECT Category, Count(Distinct Num) AS DistinctItems
FROM SomeTable
GROUP BY Category

returns:
Category  Num
-------------
x         3
y         1
z         2
                                    
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:

Select allOpen in new window



However, Access's Jet SQL does not support this syntax.  This article will demonstrate three techniques for computing distinct counts.  The first approach uses two queries for each distinct count to be performed.  The second uses sub-queries to accomplish the same thing in one SQL statement.  Lastly, the third uses Visual Basic for Applications (VBA) code to create a user-defined function, thus simplifying the operation further but at the cost of some execution speed.

2

Distinct Counts Using 'Two-Step' Queries



One way to generate distinct counts in Access is to utilize what I refer to as a "two-step" process:
  • First you create a "Step1" query that contains all of the distinct rows defined by the columns that you want to count and the columns you wish to group on

  • Then you create a "Step2" query that performs the aggregations using the "Step1" query as its source


Using the simple example from Section 1, to determine the count of distinct values in the Category column using the two-step process, we would first create a "Step1" query such as:

SELECT Category
FROM SomeTable
GROUP BY Category
                                    
1:
2:
3:

Select allOpen in new window



If we save that as "Step1", we would then get out final result by creating a query that uses Step1 as its source:

SELECT Count(*) AS DistinctCategories
FROM Step1
                                    
1:
2:

Select allOpen in new window



Using the same example, if we wanted to know the count of distinct Num values for each Category, we could use this SQL for Step1:

SELECT Category, Num
FROM SomeTable
GROUP BY Category, Num
                                    
1:
2:
3:

Select allOpen in new window



The SQL for Step2 then becomes:

SELECT Category, Count(Num) AS DistinctNum
FROM Step1
GROUP BY Category
                                    
1:
2:
3:

Select allOpen in new window



In the attached sample file (see Section 5), you will see four sample queries that demonstrate this technique:
  • Approach1_AllAccounts_Step1 and Approach1_AllAccounts_Step2, distinct count of Account values

  • Approach1_ProjectsByAcct_Step1 and Approach1_ProjectsByAcct_Step2, distinct count of projects for each Account

  • Approach1_TasksAndUsersByProject_Step1 and Approach1_TasksAndUsersByProject_Step2, distinct count of task/user combinations for each project

  • Approach1_UsersByAcctAndProject_Step1 and Approach1_UsersByAcctAndProject_Step2, distinct count of users for each account/project combination


3

Distinct Counts Using Sub-queries



In addition to the "two-step" process described above, you can generate distinct counts from a single SQL statement by using sub-queries.  The sub-query defines a derived table that has only distinct rows in it, and then the main query performs the aggregation on that derived table.  That derived table should include in its SELECT clause all of the columns for which you want a distinct count, as well as all of the columns that you wish to group on.

Indeed, that is exactly what Access is doing implicitly in the "two-step" approach outlined in Section 2: when Access runs the "Step2" query, it is in effect treating the corresponding "Step1" query as a derived table.

Using the simple example from Section 1, suppose we needed to determine the count of distinct values in the Category column.  To determine that using a sub-query:

SELECT Count(z.Category) AS DistinctItems
FROM
    (SELECT Category
    FROM SomeTable
    GROUP BY Category) AS z
                                    
1:
2:
3:
4:
5:

Select allOpen in new window



Using the same example, if we wanted to know the count of distinct Num values for each Category, we could use:

SELECT z.Category, Count(z.Num)
FROM
    (SELECT s.Category, s.Num
    FROM SomeTable s
    GROUP BY s.Category, s.Num) AS z
GROUP BY z.Category
                                    
1:
2:
3:
4:
5:
6:

Select allOpen in new window



Please note that if you enter SQL similar to the above into Access's query designer, after saving the query Access will adjust the SQL statement so that they will look more like this:

SELECT Count(z.Category) AS DistinctItems
FROM [SELECT Category FROM SomeTable GROUP BY Category]. AS z

SELECT z.Category, Count(z.Num)
FROM [SELECT s.Category, s.Num FROM SomeTable s GROUP BY s.Category, s.Num]. AS z
GROUP BY z.Category
                                    
1:
2:
3:
4:
5:
6:

Select allOpen in new window



Note: In the SQL examples using the sub-query approach, you may have noticed that I used an alias ("z") for the derived table returned by the sub-query.  While aliases are usually optional and are thus used mainly to improve your SQL's readability, in this case Jet SQL actually does require the alias.  Jet will auto-create an alias if needed, but it can only handle one; therefore, it is always a best practice to declare your own alias when using sub-queries so you have full knowledge and control.

In the attached sample file (see Section 5), you will see four sample queries that demonstrate this technique:
  • Approach2_AllAccounts, distinct count of Account values

  • Approach2_ProjectsByAcct, distinct count of projects for each Account

  • Approach2_TasksAndUsersByProject, distinct count of task/user combinations for each project

  • Approach2_UsersByAcctAndProject, distinct count of users for each account/project combination


4

DCountDistinct Function



While you can generate distinct counts using "two-step" queries (Section 2) or sub-queries (Section 3), you can also accomplish this by using VBA to create a user-defined function.  Thus, I have created the DCountDistinct() function.  This function provides similar functionality to the native DCount() domain aggregate function, but instead of counting all rows from the specified result set, DCountDistinct returns the number of distinct entries meeting the specified criteria.

The source code for DCountDistinct is as follows:

Function DCountDistinct(CountCols As String, Tbl As String, Optional Criteria As String = "")
    
    ' Function by Patrick G. Matthews
    
    ' Requires reference to Microsoft DAO library
    
    ' This function provides "domain aggregate"-type functionality similar to
    ' COUNT(DISTINCT ColumnName) in Transact-SQL (i.e., SQL Server's dialect).  The function
    ' can be used in queries, forms, and reports in the Access UI and/or in VBA code once added
    ' to a regular VBA module
    
    ' CountColumns is a comma-delimited list of columns for which we are counting the distinct
    '   values (or combinations of values if 2+ columns are specified).  Place field names in
    '   square brackets if they do not meet the customary rules for naming DB objects
    ' Tbl is the table/query the data are pulled from.  Place table name in square brackets
    '   if they do not meet the customary rules for naming DB objects
    ' Criteria (optional) are the criteria to be applied in the grouping.  Be sure to use And
    '   or Or as needed to build the right logic, and to encase text values in single quotes
    '   and dates in #
    
    ' Just like DCount, if there are no items found matching the criteria, DCountDistinct
    ' returns zero
    
    Dim rs As DAO.Recordset
    Dim SQL As String
    
    ' Enable error handler
    
    On Error GoTo ErrHandler
    
    ' Build query that counts the number of distinct items (or combinations of items, if
    ' CountCols specifies 2+ columns) that meet the specified criteria.  If no criteria are
    ' specified, then count goes against entire table
    
    SQL = "SELECT Count(1) AS Result " & _
        "FROM (" & _
            "SELECT DISTINCT " & CountCols & " " & _
            "FROM " & Tbl & " " & _
            IIf(Criteria <> "", "WHERE " & Criteria, "") & _
        ")"
        
    ' Open recordset with result
    
    Set rs = CurrentDb.OpenRecordset(SQL)
    
    ' Set return value
    
    DCountDistinct = rs!Result
    
    rs.Close
    
    GoTo Cleanup
    
ErrHandler:
    DCountDistinct = CVErr(Err.Number)
    
Cleanup:
    Set rs = Nothing
    
End Function
                                    
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:
26:
27:
28:
29:
30:
31:
32:
33:
34:
35:
36:
37:
38:
39:
40:
41:
42:
43:
44:
45:
46:
47:
48:
49:
50:
51:
52:
53:
54:
55:
56:
57:
58:
59:
60:

Select allOpen in new window



DCountDistinct takes the following arguments:
  • CountCols indicates the columns for which you want to count distinct values.  To pass more than one column, use a comma to delimit the column names.  You must place the column names in square brackets if the column names do not meet the usual requirements for database object names

  • Tbl is the name of the table or query from which you are counting distinct items

  • Criteria is an optional argument for specifying criteria to apply to the results.  I recommend placing all column names in square brackets.  Be sure to use single-quotes as text qualifiers and # as date qualifiers.  Use logical And and/or Or to pass multiple conditions


To use this function in your project, simply add the code to a regular VBA module, and make sure to set a reference to the latest version of Microsoft Data Access Objects (DAO) installed on your computer.  Please note that using DCountDistinct() will probably be slower than using the "two-step" or sub-query approaches explored in Sections 2 and 3, but in small to medium data sets you are unlikely to notice the difference, and the convenience the function offers may outweigh the increased runtime.

Using the simple example from Section 1, we can determine the count of distinct values in the Category column using DCountDistinct():

SELECT DCountDistinct("Category", "SomeTable") AS DistinctItems
                                    
1:

Select allOpen in new window



Using the same example, if we wanted to know the count of distinct Num values for each Category, we could use:

SELECT Category, DCountDistinct("Num", "SomeTable", "[Category] = '" & Category & "'")
FROM SomeTable
GROUP BY Category
                                    
1:
2:
3:

Select allOpen in new window



In the attached sample file (see Section 5), you will see four sample queries that demonstrate this technique:
  • Approach3_AllAccounts, distinct count of Account values

  • Approach3_ProjectsByAcct, distinct count of projects for each Account

  • Approach3_TasksAndUsersByProject, distinct count of task/user combinations for each project

  • Approach3_UsersByAcctAndProject, distinct count of users for each account/project combination


5

Sample File



DCountDistinct.mdb
  • 432 KB
  • Sample database
DCountDistinct.mdb


The attached sample file is a Microsoft Access 2003-format database containing a single table, "Sample", along with several queries demonstrating the count distinct approaches described above.  The file's VBA Project also contains the source code for the DCountDistinct function.


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
If you liked this article and want to see more from this author,  please click here.

If you found this article helpful, please click the Yes button near the:

      Was this article helpful?

label that is just below and to the right of this text.   Thanks!
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
    Asked On
    2010-02-08 at 10:38:13ID2417
    Tags

    Access

    ,

    Query

    ,

    Queries

    ,

    SQL

    ,

    Distinct

    ,

    Count

    ,

    Distinct Count

    ,

    VBA

    ,

    Visual Basic

    ,

    Aggregate

    ,

    Domain Aggregate

    Topic

    Microsoft Access Database

    Views
    12765

    Comments

    Expert Comment

    by: mwvisa1 on 2010-02-08 at 16:17:42ID: 9369

    Patrick:

    Very nice article!

    Seeing a number of these questions in the SQL topic areas, I am certain this will come in handy for those new to count(distinct) and possibly MS Access.  As someone who was familiar with this previously, I was still thrilled to get a nice new function in the DCountDistinct() which gives a nice alternative in Jet SQL to sub-queries or multiple-step processes.  As with the other domain functions, it seems well worth the minor overhead of adding VBA to the mix. :)

    Thank you very much!
    Voted a big YES above.

    Respectfully yours,

    Kevin

    Expert Comment

    by: david251 on 2010-02-25 at 08:31:47ID: 10074

    Great Article!  Thanks

    -David251

    Expert Comment

    by: csehz on 2011-09-01 at 23:25:25ID: 31055

    Patrick thanks for this article very much, your sample database was the biggest help to me to apply in my work

    Add your Comment

    Please Sign up or Log in to comment on this article.

    Join Experts Exchange Today

    Gain Access to all our Tech Resources

    Get personalized answers

    Ask unlimited questions

    Access Proven Solutions

    Search 3.2 million solutions

    Read In-Depth How-To Guides

    1000+ articles, demos, & tips

    Watch Step by Step Tutorials

    Learn direct from top tech pros

    And Much More!

    Your complete tech resource

    See Plans and Pricing

    30-day free trial. Register in 60 seconds.

    Loading Advertisement...

    Top MS Access Experts

    1. mbizup

      784,072

      Sage

      4,520 points yesterday

      Profile
      Rank: Genius
    2. capricorn1

      766,094

      Sage

      10,500 points yesterday

      Profile
      Rank: Savant
    3. boag2000

      656,789

      Sage

      6,500 points yesterday

      Profile
      Rank: Genius
    4. LSMConsulting

      447,337

      Wizard

      1,000 points yesterday

      Profile
      Rank: Savant
    5. fyed

      441,791

      Wizard

      1,510 points yesterday

      Profile
      Rank: Genius
    6. DatabaseMX

      341,349

      Wizard

      1,500 points yesterday

      Profile
      Rank: Savant
    7. JDettman

      274,883

      Guru

      2,510 points yesterday

      Profile
      Rank: Genius
    8. peter57r

      259,954

      Guru

      0 points yesterday

      Profile
      Rank: Savant
    9. als315

      222,728

      Guru

      6,000 points yesterday

      Profile
      Rank: Genius
    10. matthewspatrick

      157,448

      Guru

      3,610 points yesterday

      Profile
      Rank: Savant
    11. Helen_Feddema

      125,149

      Master

      0 points yesterday

      Profile
      Rank: Genius
    12. imnorie

      118,132

      Master

      600 points yesterday

      Profile
      Rank: Genius
    13. danishani

      106,613

      Master

      0 points yesterday

      Profile
      Rank: Wizard
    14. cactus_data

      85,952

      Master

      1,200 points yesterday

      Profile
      Rank: Genius
    15. TheHiTechCoach

      80,124

      Master

      0 points yesterday

      Profile
      Rank: Sage
    16. dqmq

      77,066

      Master

      1,500 points yesterday

      Profile
      Rank: Genius
    17. harfang

      74,385

      Master

      50 points yesterday

      Profile
      Rank: Genius
    18. Nick67

      59,053

      Master

      0 points yesterday

      Profile
      Rank: Sage
    19. Sudonim

      49,486

      0 points yesterday

      Profile
      Rank: Wizard
    20. pteranodon72

      45,520

      2,000 points yesterday

      Profile
      Rank: Wizard
    21. aikimark

      43,748

      2,000 points yesterday

      Profile
      Rank: Genius
    22. IrogSinta

      37,564

      1,500 points yesterday

      Profile
    23. TechMommy

      35,330

      70 points yesterday

      Profile
      Rank: Master
    24. BillDenver

      31,954

      0 points yesterday

      Profile
      Rank: Guru
    25. hnasr

      31,316

      0 points yesterday

      Profile
      Rank: Genius

    Hall Of Fame