Community Pick: Many members of our community have endorsed this article.
Editor's Choice: This article has been selected by our editors as an exceptional contribution.

Creating and Using a Tally Table in Access

aikimarkGet vaccinated; Social distance; Wear a mask
CERTIFIED EXPERT
Published:

Introduction

This article introduces the reader to the concept of a table (or query) of numbers, commonly called a Tally Table.  I detail different methods to create and populate a Tally Table in Access and show the new table used as part of a problem solution.

For years I've used an OneRow table as a part of my Access solutions, eliminating many duplicate values in a very efficient manner.  In the past couple of years, I have seen a similar special table in SQL Server and MySQL articles.  I have now participated in an EE Access question whose solution lent itself to a Tally Table.

Definition: A Tally Table is a record source (database table or query) that is a contiguous sequence of integer values in consecutive rows.  This is sometimes called a number table.

Why use a Tally Table: There are many situations where you need a sequence of values.  While it is easy to write loops in VBA and TSQL, there are no looping statements in Access SQL syntax.  While a Tally Table is a sequence of integer values, such values can be transformed into other Access/VBA data types (character, floating point, date, etc.)

If you are interested in using a Tally Table for a non-Access database, start by reading these articles.
SQL Server - https://www.experts-exchange.com/A_1221.html
SQL Server - http://www.sqlservercentral.com/articles/Tally+Table/72993
MySQL - https://www.experts-exchange.com/A_3573.html
SQLite - https://www.experts-exchange.com/A_3570.html


_____________________________________

Creating the Tally Table

Since an Access Select statement requires a From clause, we can't use the TSQL method of creating the number sequence strictly within the stored procedure, with no physical storage in a table:
Select 1
                      Union All Select 2
                      Union All Select 3

Open in new window

Once we have some minimal sequence in a table, we can use table joining 'magic' (matrix product growth) in a query to produce a much larger sequence of numbers.  I will show you a trick to create a query-only source of numbers, but I do a little cheating with my From clause.

Note: In the following code examples, I am starting the numbering sequence after highest current number, using 1 if the table is empty.  This is an efficiency consideration -- appending new numbers rather than trying to repopulate the entire table.  Alternatively, this code can be changed to delete all the rows before populating the (now) empty table.  


_____________________________________

Create the TallyTable with code

Public Sub MakeTallyTable(Optional parmLimit As Long = 1000, Optional parmPopulate As Boolean = False)
                        Dim lngLoop As Long
                        Dim td As TableDef
                        Dim fd As Field
                        Dim vID As Variant
                        Const cCommitInterval As Long = 500
                        
                        'Create the Tally Table if it doesn't already exist.
                        On Error Resume Next
                        vID = DBEngine(0)(0).TableDefs("TallyTable").Name
                        If vID = "TallyTable" Then
                        Else
                          Set td = DBEngine(0)(0).CreateTableDef("TallyTable")
                          Set fd = New Field
                          fd.Name = "ID"
                          fd.Type = dbLong
                          td.Fields.Append fd
                          DBEngine(0)(0).TableDefs.Append td
                        End If
                      
                      '************************
                      'Optional population of the table
                        If parmPopulate Then
                          vID = DMax("ID", "TallyTable")
                          If IsNull(vID) Then
                            vID = 0
                          End If
                          If vID < parmLimit Then
                            DBEngine(0).BeginTrans
                            For lngLoop = vID + 1 To parmLimit
                              DBEngine(0)(0).Execute "Insert Into TallyTable (ID) Values (" & lngLoop & ")"
                              If (lngLoop Mod cCommitInterval) = 0 Then
                                DBEngine(0).CommitTrans
                                DBEngine(0).BeginTrans
                              End If
                            Next
                            DBEngine(0).CommitTrans
                          End If
                        End If
                      '************************
                      End Sub

Open in new window



_____________________________________

Create the TallyTable with SQL

Create Table TallyTable2
                        (ID Long)

Open in new window

Note: You can also rewrite the first 'populating query' below, converting them into a Make Table query.

_____________________________________

Populating the Tally Table with code

Public Sub PopulateTallyTable(Optional parmLimit As Long = 1000)
                        Dim lngLoop As Long
                        Const cCommitInterval As Long = 500
                        
                        vID = DMax("ID", "TallyTable")
                        If IsNull(vID) Then
                          vID = 0
                        End If
                        If vID < parmLimit Then
                          DBEngine(0).BeginTrans
                          For lngLoop = vID + 1 To parmLimit
                            DBEngine(0)(0).Execute "Insert Into TallyTable (ID) Values (" & lngLoop & ")"
                            If (lngLoop Mod cCommitInterval) = 0 Then
                              DBEngine(0).CommitTrans
                              DBEngine(0).BeginTrans
                            End If
                          Next
                          DBEngine(0).CommitTrans
                        End If
                      End Sub

Open in new window



_____________________________________

Populating the Tally Table with SQL

When using queries to generate a number sequence, there still must be a table with one or more rows.  To keep this example simple, I'm populating my source table with ten numbers (1-10).  You can use a non-empty system table if you don't already have a non-empty table.

Note: The following SQL will not run, as shown, in Access97 since it doesn't support the From (Select...) form of SQL.  Access97 developers can work around that limitation by saving the Unioned Select statements as a separate query and then include the name of that query in your From clause.

I'm using the MSysObjects table in this example.
QBTallyTableBase10 query:
SELECT DISTINCT ID
                      FROM (SELECT Top 1  1  As ID FROM MSysObjects
                        UNION ALL
                      SELECT Top 1  2   FROM MSysObjects
                        UNION ALL
                      SELECT Top 1  3  FROM MSysObjects
                        UNION ALL
                      SELECT Top 1  4  FROM MSysObjects
                        UNION ALL
                      SELECT Top 1  5  FROM MSysObjects
                        UNION ALL
                      SELECT Top 1  6  FROM MSysObjects
                        UNION ALL
                      SELECT Top 1  7  FROM MSysObjects
                        UNION ALL
                      SELECT Top 1  8  FROM MSysObjects
                        UNION ALL
                      SELECT Top 1  9  FROM MSysObjects
                        UNION ALL 
                      SELECT Top 1 10 FROM MSysObjects);

Open in new window


However, there is a performance penalty for generating all the numbers from such 'query magic'.  I created a TallyTable2 with the same ten numbers and compare the performance of 1K and 1M number retrievals below.
Populate an already defined TallyTable2 from the magic query.
Insert Into TallyTable2 (ID)
                      Select ID From QBTallyTableBase10

Open in new window


Note: You can also create and populate the TallyTable2 with a Make-Table form of the query:

Make-Table Example 1 -- Single query
SELECT DISTINCT ID INTO TallyTable2
                      FROM (SELECT Top 1  1  As ID FROM MSysObjects
                        UNION ALL
                      SELECT Top 1  2   FROM MSysObjects
                        UNION ALL
                      SELECT Top 1  3  FROM MSysObjects
                        UNION ALL
                      SELECT Top 1  4  FROM MSysObjects
                        UNION ALL
                      SELECT Top 1  5  FROM MSysObjects
                        UNION ALL
                      SELECT Top 1  6  FROM MSysObjects
                        UNION ALL
                      SELECT Top 1  7  FROM MSysObjects
                        UNION ALL
                      SELECT Top 1  8  FROM MSysObjects
                        UNION ALL
                      SELECT Top 1  9  FROM MSysObjects
                        UNION ALL 
                      SELECT Top 1  10 FROM MSysObjects);

Open in new window


Make-Table Example 2 -- Using QBTallyTableBase10
SELECT QBTallyTableBase10.ID INTO TallyTable2
                      FROM QBTallyTableBase10;

Open in new window


Now that we have a small set of (ten) numbers, we use the multiplicative property of table joins to produce much larger sets.
QueryBasedTallyTable_BIG -- 1M numbers using 10-row table
SELECT ((QBTT6.ID-1)*100000)+((QBTT5.ID-1)*10000)+((QBTT4.ID-1)*1000)+((QBTT3.ID-1)*100)+((QBTT2.ID-1)*10)+QBTT.ID AS ID
                      FROM TallyTable2 AS QBTT, TallyTable2 AS QBTT2, TallyTable2 AS QBTT3, TallyTable2 AS QBTT4, TallyTable2 AS QBTT5, TallyTable2 AS QBTT6;

Open in new window


QueryBasedTallyTable -- 1K numbers using 10-row table
SELECT ((QBTT3.ID-1)*100)+((QBTT2.ID-1)*10)+QBTT.ID AS ID
                      FROM TallyTable2 AS QBTT, TallyTable2 AS QBTT2, TallyTable2 AS QBTT3;

Open in new window


QueryBasedTallyTable_BIG_AllQry -- 1M numbers using only QBTallyTableBase10 query data
SELECT ((TT6.ID-1)*100000)+((TT5.ID-1)*10000)+((TT4.ID-1)*1000)+((TT3.ID-1)*100)+((TT2.ID-1)*10)+TT.ID AS ID
                      FROM QBTallyTableBase10 AS TT, QBTallyTableBase10 AS TT2, QBTallyTableBase10 AS TT3, QBTallyTableBase10 AS TT4, QBTallyTableBase10 AS TT5, QBTallyTableBase10 AS TT6;

Open in new window


QueryBasedTallyTable_AllQry -- 1K numbers using only QBTallyTableBase10 query data
SELECT DISTINCT ((TT3.ID-1)*100)+((TT2.ID-1)*10)+TT.ID AS ID
                      FROM QBTallyTableBase10 AS TT, QBTallyTableBase10 AS TT2, QBTallyTableBase10 AS TT3;

Open in new window


Note: You could have populated the TallyTable table with the numbers produced by these queries.  We will leave our TallyTable2 table with only 10 rows.  It is up to you to decide whether you want to use the code, define and populate queries, or a make-table query to populate your full-sized TallyTable.

As I mentioned earlier, there is a performance penalty when calculating all the numbers in memory on-the-fly.  Since later usage examples only use the queries based on TallyTable2, you ought to understand why I didn't use the all-query number source.
QueryBasedTallyTable performance comparisons
Recordset type           	Avg time   	min   	max
                      =========================    	========	====	====
                      1K table rows (Dynaset) 	22.00          	16     	31
                      1K table rows (snapshot)	24.80        	15     	31
                      1K query rows (Dynaset) 	62.60        	47     	94
                      1K query rows (snapshot)	62.20         	47     	78
                      1M table rows (Dynaset) 	22308.60   	20686	25319
                      1M table rows (Snapshot)	25915.00   	24570	26958
                      1M query rows (Dynaset) 	27827.80   	26411	28596
                      1M query rows (Snapshot)	29272.00   	28392	30575

Open in new window



_____________________________________

Date ranges problem - TallyTable Example 1

This article got its genesis from this recent question about looping in MS-Access SQL.
https://www.experts-exchange.com/questions/26985283/SQL-While-Loop-in-MS-Access.html

The solution needs to produce pairs of dates for reporting periods.  The first column is the start of each logical week in the period and will be one of the following:
Starting date
Monday
First day of the month
The second column is the end of each logical week in the period and will be one of the following:
Ending date
Sunday
Last day of the month
An example of the desired data is posted here:
https://www.experts-exchange.com/questions/26985283/SQL-While-Loop-in-MS-Access.html?anchorAnswerId=35492707#a35492707

During my preparation for this article, I did some performance tuning to the queries, so the following (optimized) queries are different than those posted in the question.  There are two configurations, one based on the TallyTable table and the other based on the QueryBasedTallyTable query.  

Notes:
All the queries have two parameters ([start date] and [end date])
The use of the Distinct keyword is used for performance.  Back in the 90s, I discovered that ordered queries act as if they were indexed.  This may be caused by caching or some other query optimization 'magic'.  If you find that your queries don't execute as quickly as you'd like, try adding Distinct keyword, Order By clause, Or Group By clause.
If you need to update the resulting query results, do not use the Distinct or Group By performance-enhancing trick.  Their use results in a non-updatable record source.

Now that we have our TallyTable and its query equivalent, QueryBasedTallyTable, we can construct our date sources for the two columns.

JustDatesFromTallyTable
Now that we have our numbers, we need a source of dates for our columns.
PARAMETERS [start date] DateTime, [end date] DateTime;
                      SELECT DISTINCT TT.ID, ([start date]+[ID]-1) AS GenDate
                      FROM TallyTable As TT
                      WHERE (TT.ID <= ([end date]-[start date]+1) );

Open in new window


ReportingPeriods_Fast
Now that we have a source of dates, we have a solution to the problem.
SELECT DISTINCT T.GenDate, (Select Min(NxD.gendate) From JustDatesFromTallyTable As NxD  Where (NxD.ID Between T.ID And (T.ID+7)) And (Weekday(Nxd.GenDate)=1 Or NxD.GenDate = [end date] Or NxD.GenDate = DateSerial(Year(NxD.GenDate),Month(NxD.GenDate)+1,0) )) AS EOW
                      FROM JustDatesFromTallyTable AS T
                      WHERE (((Weekday([T].[GenDate]))=2)) OR (((Day([T].[GenDate]))=1)) OR (((T.ID)=1));

Open in new window


JustDatesFromQBTallyTable
Let's not forget the query-based Tally Table we created earlier.  We need to produce dates from these numbers as well.
PARAMETERS [start date] DateTime, [end date] DateTime;
                      SELECT DISTINCT TT.ID, ([start date]+[ID]-1) AS GenDate
                      FROM QueryBasedTallyTable AS TT
                      WHERE (((TT.ID)<=([end date]-[start date]+1)));

Open in new window


ReportingPeriodsQB_Fast
Now we can produce the two-column query for the question, based on the query dates.
SELECT DISTINCT T.GenDate, (Select Min(NxD.gendate) From JustDatesFromQBTallyTable As NxD  Where (NxD.ID Between T.ID And (T.ID+7)) And (Weekday(Nxd.GenDate)=1 Or NxD.GenDate = [end date] Or NxD.GenDate = DateSerial(Year(NxD.GenDate),Month(NxD.GenDate)+1,0) )) AS EOW
                      FROM JustDatesFromQBTallyTable AS T
                      WHERE (((Weekday(T.GenDate))=2)) Or (((Day(T.GenDate))=1)) Or (((T.ID)=1));

Open in new window


How is the performance?
The table-versus-query performance comes down to a space/time trade-off.  The TallyTable takes up space in your database, but the numbers are available immediately.  The query requires a very small (ten row) table, but the numbers must be calculated when used.  The test involved opening a recordset from a parameterized query and moving to the last record.  This was repeated 10 times.  Different configurations of date ranges and recordset types were timed.  The performance test was run at least three times.  In addition, for two of the tests I added an index to the table and a primary key index to the table.  Although the indexes improved performance, we doubled the size of the TallyTable.  The time measurement is ticks.  These times represent a 10 iteration execution.

This is the test of the date range posted in the question (2/22/2011 - 7/9/2011).
There is no index on TallyTable used in this test.
Recordset type   	Avg time   	min   	max
                      ==============    	========	====	====
                      Dynaset (table)  	707.00      	686   	733
                      Snapshot (table)	738.67      	718   	749
                      Snapshot (query)	1456.00     	1451  	1466
                      Dynaset (query) 	1487.33    	1482 	1498

Open in new window


I thought it would be worth comparing a larger date range (2/22/2011 - 8/15/2012), which is 3.6 times the test above.  This is good practice in both capacity and performance testing.  Since my tests indicated that the performance degradation was non-linear, I decided to evaluate the influence of indexes.
*** No index on TallyTable ***
                      Recordset type   	Avg time   	min   	max
                      ==============    	========	====	====
                      Snapshot            	8164.00    	8065 	8315
                      Dynaset             	8398.00    	7956 	8705
                      
                      *** Non-PK Index on TallyTable **
                      Recordset type   	Avg time   	min   	max
                      ===============     	========	====	====
                      Dynaset              	7498.33    	7441 	7535
                      Snapshot           	7555.67    	7519 	7598
                      
                      *** Primary Key index on TallyTable ***
                      Recordset type   	Avg time   	min   	max
                      ==============     	========	====	====
                      Snapshot            	7695.67    	7612 	7816
                      Dynaset              	7748.33    	7551 	8112 
                      
                      *** Query ***
                      Recordset type   	Avg time   	min   	max
                      ==============     	========	====	====
                      Snapshot           	12972.33   	12605	13728
                      Dynaset              	13040.00   	12745	13838

Open in new window



_____________________________________

Simulating Production Data Volume - TallyTable Example 2

It is helpful to stress test your database designs and applications using large sets of data.  When you add your TallyTable (or query) to existing data sources in your queries you can easily multiply its row count.  If you had a 1,000 row table, but new that the production table would be 10 million rows, you could add a TallyTable (or  query) to your test-table query and limit the TallyTable to 10,000 rows.


_____________________________________

Parameterized Top N - TallyTable Example 3

If you have tried to create a Select query with a Top N keyword and ever wished you could parameterize it, a TallyTable might be the solution.

This can be implemented a few different ways, depending on the source and nature of the data.
Join your TallyTable with an ordinalcount column and limit the TallyTable rows with a parameter.
Add (select count(*) ) as an ordinalcount column with direct parameter comparison
Add DCount() as an ordinalcount column with direct parameter comparison

In order to present a reasonable test, I had to make some assumptions about the data.
Either there is no autonumber field or the table has some deleted rows.  This is the reason why I didn't test an inner join between the TallyTable and my performance test table.  Yes, it is bad design/practice to have a table without an autonumber field.  
If you have an autonumber field on a read-only table (no deletes), you could parameterize the Top N by comparing on this field
If you have an autonumber field on a table with deleted rows, you should use that field in your ordinal count calculations, since long integer comparisons are much faster than text field comparisons (used in this test).
I cast my ordinal count column as a long integer data type for faster comparison
I explicitly defined my parameter as a long integer data type for faster comparison.
The data might be a query instead of a table

Note: Beware of lousy performance due to RBAR (row-by-agonizing-row) operations.  This is most pronounced in the Word Count example queries below.

There is a very clever DistinctDCount() alternative -- not shown or tested in this section, since I only care about row counts.
https://www.experts-exchange.com/A_2417.html


SQL Highlights:
To get reasonably good performance from my queries, I made sure that the query parameters were explicitly typed.
PARAMETERS ____ Long;

Open in new window


In the performance figures, the Select refers to a column defined like this:
(Select Count(*) From Lot_Mast As C Where C.[lot-no] >= lot_mast.[lot-no]) AS OrdinalValue

Open in new window


In the performance figures, the DCount refers to a column defined like this.. Note that I have also added a cast to the column data as a long integer.  It was surprising that this made such a performance difference.
CLng(DCount("[lot-no]","[Lot_mast]","[lot-no]>='" & [lot-no] & "'")) AS OrdinalValue

Open in new window


ParameterizedTop_N performance tests - no index on the TallyTable
Recordset type           		Avg time   	min   	max
                      ============================   		========	====	====
                      Select Tally Join (Snapshot)    	156.00      	156   	156
                      Select Tally Join (Dynaset)		161.33      	156   	172
                      Select (Snapshot)			228.67       	218   	234
                      Select (Dynaset)			234.33      	219   	265
                      Dcount Tally Join (Dynaset)		894.67      	874   	921
                      Dcount Tally Join (Snapshot)    	920.67      	906   	936
                      Dcount (Dynaset)			1481.67     	1466  	1513
                      Dcount (Snapshot)			1549.67    	1513 	1591
                      Dcount Tally (Snapshot)  		2106.00    	2090 	2137
                      Dcount Tally (Dynaset)   		2158.00    	2106 	2184
                      Select Tally (Dynaset)   		5179.33    	5117 	5257
                      Select Tally (Snapshot) 		5200.00    	5117 	5288

Open in new window

In case you were curious, the fastest result was with a primary key index on the TallyTable
Recordset type           		Avg time   	min   	max
                      ============================   		========	====	====
                      Select Tally Join (Snapshot)    	140.67   	140	141

Open in new window



_____________________________________

Word/character counts - TallyTable Example 4

There are many times when I need to know how many delimited words there are in a text field.  Usually, I have written a small user-defined function that does the string inspection, since the VBA environment has looping and parsing functions (Split and Replace).  It can also be done using a TallyTable.

I tested two different queries to find the delimiter positions.  One used the MID() function and the other used the InStr() function.

DelimiterPositions
PARAMETERS Delim Text ( 255 );
                      SELECT DISTINCT DataToSplit.PersonID, TallyTable.ID
                      FROM DataToSplit, TallyTable
                      WHERE (((TallyTable.ID)<=255) AND ((CBool(Len(Mid([Delim] & [data] & [Delim],[id],2))=Len([Delim]) And Mid([Delim] & [data] & [Delim],[id],2)=[Delim]))=True));

Open in new window


DelimiterPositions_Instr
PARAMETERS Delim Text ( 255 );
                      SELECT DISTINCT DataToSplit.PersonID, TallyTable.ID
                      FROM DataToSplit, TallyTable
                      WHERE (((TallyTable.ID)<=255) AND (Instr(ID,[Delim] & [data] & [Delim],[Delim])=[ID]));

Open in new window


Word count
Once you know the delimiter positions, you can use a Group By query to get the count of words.  Since I added delimiters to the front and back of the text data in my DelimiterPositions data, I had to subtract one from the Count() value.
SELECT DelimiterPositions.PersonID, Count(DelimiterPositions.ID)-1 AS WordCount
                      FROM DelimiterPositions
                      GROUP BY DelimiterPositions.PersonID;

Open in new window


Parsed/Split text data
Since I've already calculated the delimiter positions, It is an easy step to parse the text.
PARAMETERS Delim Text ( 255 );
                      SELECT DataToSplit.PersonID, Trim(Mid([data],[DelimiterPositions_Instr].[ID],[NextDelimPosn]-[DelimiterPositions_Instr].[ID])) AS [Procedure], DelimiterPositions_Instr.ID, InStr([ID]+1,[delim] & [data] & [Delim],[Delim]) AS NextDelimPosn
                      FROM DataToSplit INNER JOIN DelimiterPositions_Instr ON DataToSplit.PersonID = DelimiterPositions_Instr.PersonID
                      WHERE (((InStr([ID]+1,[delim] & [data] & [Delim],[Delim]))<>0))
                      ORDER BY DataToSplit.PersonID, DelimiterPositions_Instr.ID;

Open in new window


Note: Although I could have done this parsing by joining the DelimiterPositions_Instr query to itself, it was not nearly as good a performer as the ParsedData query above.

VBA Function Splitting
If you find that you need a more sophisticated word count or splitting process, you should read these articles:
https://www.experts-exchange.com/A_1679.html
https://www.experts-exchange.com/A_1480.html

If this article has been helpful, please click the Yes link below.
5
13,413 Views
aikimarkGet vaccinated; Social distance; Wear a mask
CERTIFIED EXPERT

Comments (2)

aikimarkGet vaccinated; Social distance; Wear a mask
CERTIFIED EXPERT
Top Expert 2014

Author

Commented:
Attention readers:

I caught an error on the first of the delimiter position queries.  The length parameter of the Mid() function was a literal.  It should have been equal to the length of the [Delim] parameter.  I also simplified the query, although it still does not perform as well as the InStr version of the query.

DelimiterPositions
PARAMETERS Delim Text ( 255 );
SELECT DISTINCT DataToSplit.PersonID, TallyTable.ID
FROM DataToSplit, TallyTable
WHERE ((TallyTable.ID<=255) 
AND 
(Mid([Delim] & [data] & [Delim],[id],Len([Delim]))=[Delim]));

Open in new window

aikimarkGet vaccinated; Social distance; Wear a mask
CERTIFIED EXPERT
Top Expert 2014

Author

Commented:
Here is a similar question, where a data source (query) contains startdate and stopdate columns and the member needed to populate another table with multiple rows, containing individual dates between these two date column values.

A new constraint in this question is the need to prevent duplicate sets of rows when the process was repeated.  Therefore, a new table was required (HasBeenCloned) to persist the key values involved in the copying process.

http:/Q_27663167.html

Have a question about something in this article? You can receive help directly from the article author. Sign up for a free trial to get started.