- Community Pick
- Experts Exchange Approved
- Editor's Choice
- 1
Introduction
All major SQL dialects allow for a Count aggregate function, and Access's native Jet SQL is no exception. As in other database platforms, Access's Count() (as well as the "domain aggregate" DCount()) function returns the number of records pertaining to a grouped result set and meeting the criteria specified in the query (in the case of Count()) or in DCount() in the third argument of the function.
Consider these example data:
In this case, it is very simple to get a count of items for each Category:
Suppose, however, that instead of getting the count of each item for a given category, you instead wanted to know the count of distinct items for each category. SQL Server's Transact SQL and some other SQL dialects support the Count(Distinct ColumnName) syntax:
However, Access's Jet SQL does not support this syntax. This article will demonstrate three techniques for computing distinct counts. The first approach uses two queries for each distinct count to be performed. The second uses sub-queries to accomplish the same thing in one SQL statement. Lastly, the third uses Visual Basic for Applications (VBA) code to create a user-defined function, thus simplifying the operation further but at the cost of some execution speed.
- 2
Distinct Counts Using 'Two-Step' Queries
One way to generate distinct counts in Access is to utilize what I refer to as a "two-step" process:
- First you create a "Step1" query that contains all of the distinct rows defined by the columns that you want to count and the columns you wish to group on
- Then you create a "Step2" query that performs the aggregations using the "Step1" query as its source
Using the simple example from Section 1, to determine the count of distinct values in the Category column using the two-step process, we would first create a "Step1" query such as:
If we save that as "Step1", we would then get out final result by creating a query that uses Step1 as its source:
Using the same example, if we wanted to know the count of distinct Num values for each Category, we could use this SQL for Step1:
The SQL for Step2 then becomes:
In the attached sample file (see Section 5), you will see four sample queries that demonstrate this technique:
- Approach1_AllAccounts_St
ep1 and Approach1_AllAccounts_Ste p2, distinct count of Account values
- Approach1_ProjectsByAcct
_Step1 and Approach1_ProjectsByAcct_ Step2, distinct count of projects for each Account
- Approach1_TasksAndUsersB
yProject_S tep1 and Approach1_TasksAndUsersBy Project_St ep2, distinct count of task/user combinations for each project
- Approach1_UsersByAcctAnd
Project_St ep1 and Approach1_UsersByAcctAndP roject_Ste p2, distinct count of users for each account/project combination
- 3
Distinct Counts Using Sub-queries
In addition to the "two-step" process described above, you can generate distinct counts from a single SQL statement by using sub-queries. The sub-query defines a derived table that has only distinct rows in it, and then the main query performs the aggregation on that derived table. That derived table should include in its SELECT clause all of the columns for which you want a distinct count, as well as all of the columns that you wish to group on.
Indeed, that is exactly what Access is doing implicitly in the "two-step" approach outlined in Section 2: when Access runs the "Step2" query, it is in effect treating the corresponding "Step1" query as a derived table.
Using the simple example from Section 1, suppose we needed to determine the count of distinct values in the Category column. To determine that using a sub-query:
Using the same example, if we wanted to know the count of distinct Num values for each Category, we could use:
Please note that if you enter SQL similar to the above into Access's query designer, after saving the query Access will adjust the SQL statement so that they will look more like this:
Note: In the SQL examples using the sub-query approach, you may have noticed that I used an alias ("z") for the derived table returned by the sub-query. While aliases are usually optional and are thus used mainly to improve your SQL's readability, in this case Jet SQL actually does require the alias. Jet will auto-create an alias if needed, but it can only handle one; therefore, it is always a best practice to declare your own alias when using sub-queries so you have full knowledge and control.
In the attached sample file (see Section 5), you will see four sample queries that demonstrate this technique:
- Approach2_AllAccounts, distinct count of Account values
- Approach2_ProjectsByAcct
, distinct count of projects for each Account
- Approach2_TasksAndUsersB
yProject, distinct count of task/user combinations for each project
- Approach2_UsersByAcctAnd
Project, distinct count of users for each account/project combination
- 4
DCountDistinct Function
While you can generate distinct counts using "two-step" queries (Section 2) or sub-queries (Section 3), you can also accomplish this by using VBA to create a user-defined function. Thus, I have created the DCountDistinct() function. This function provides similar functionality to the native DCount() domain aggregate function, but instead of counting all rows from the specified result set, DCountDistinct returns the number of distinct entries meeting the specified criteria.
The source code for DCountDistinct is as follows:
DCountDistinct takes the following arguments:
- CountCols indicates the columns for which you want to count distinct values. To pass more than one column, use a comma to delimit the column names. You must place the column names in square brackets if the column names do not meet the usual requirements for database object names
- Tbl is the name of the table or query from which you are counting distinct items
- Criteria is an optional argument for specifying criteria to apply to the results. I recommend placing all column names in square brackets. Be sure to use single-quotes as text qualifiers and # as date qualifiers. Use logical And and/or Or to pass multiple conditions
To use this function in your project, simply add the code to a regular VBA module, and make sure to set a reference to the latest version of Microsoft Data Access Objects (DAO) installed on your computer. Please note that using DCountDistinct() will probably be slower than using the "two-step" or sub-query approaches explored in Sections 2 and 3, but in small to medium data sets you are unlikely to notice the difference, and the convenience the function offers may outweigh the increased runtime.
Using the simple example from Section 1, we can determine the count of distinct values in the Category column using DCountDistinct():
Using the same example, if we wanted to know the count of distinct Num values for each Category, we could use:
In the attached sample file (see Section 5), you will see four sample queries that demonstrate this technique:
- Approach3_AllAccounts, distinct count of Account values
- Approach3_ProjectsByAcct
, distinct count of projects for each Account
- Approach3_TasksAndUsersB
yProject, distinct count of task/user combinations for each project
- Approach3_UsersByAcctAnd
Project, distinct count of users for each account/project combination
- 5
Sample File
The attached sample file is a Microsoft Access 2003-format database containing a single table, "Sample", along with several queries demonstrating the count distinct approaches described above. The file's VBA Project also contains the source code for the DCountDistinct function.
=-=-=-=-=-=-=-=-=-=-=-=-=-
If you liked this article and want to see more from this author, please click here.
If you found this article helpful, please click the Yes button near the:
Was this article helpful?
label that is just below and to the right of this text. Thanks!
=-=-=-=-=-=-=-=-=-=-=-=-=-
by: mwvisa1 on 2010-02-08 at 16:17:42ID: 9369
Patrick:
Very nice article!
Seeing a number of these questions in the SQL topic areas, I am certain this will come in handy for those new to count(distinct) and possibly MS Access. As someone who was familiar with this previously, I was still thrilled to get a nice new function in the DCountDistinct() which gives a nice alternative in Jet SQL to sub-queries or multiple-step processes. As with the other domain functions, it seems well worth the minor overhead of adding VBA to the mix. :)
Thank you very much!
Voted a big YES above.
Respectfully yours,
Kevin