<

Creating and Using a Tally Table in Access

Published on
19,989 Points
9,989 Views
5 Endorsements
Last Modified:
Awarded

Introduction

This article introduces the reader to the concept of a table (or query) of numbers, commonly called a Tally Table.  I detail different methods to create and populate a Tally Table in Access and show the new table used as part of a problem solution.

For years I've used an OneRow table as a part of my Access solutions, eliminating many duplicate values in a very efficient manner.  In the past couple of years, I have seen a similar special table in SQL Server and MySQL articles.  I have now participated in an EE Access question whose solution lent itself to a Tally Table.

Definition: A Tally Table is a record source (database table or query) that is a contiguous sequence of integer values in consecutive rows.  This is sometimes called a number table.

Why use a Tally Table: There are many situations where you need a sequence of values.  While it is easy to write loops in VBA and TSQL, there are no looping statements in Access SQL syntax.  While a Tally Table is a sequence of integer values, such values can be transformed into other Access/VBA data types (character, floating point, date, etc.)

If you are interested in using a Tally Table for a non-Access database, start by reading these articles.
SQL Server - http://www.experts-exchange.com/A_1221.html
SQL Server - http://www.sqlservercentral.com/articles/Tally+Table/72993
MySQL - http://www.experts-exchange.com/A_3573.html
SQLite - http://www.experts-exchange.com/A_3570.html


_____________________________________

Creating the Tally Table

Since an Access Select statement requires a From clause, we can't use the TSQL method of creating the number sequence strictly within the stored procedure, with no physical storage in a table:
Select 1
Union All Select 2
Union All Select 3

Open in new window

Once we have some minimal sequence in a table, we can use table joining 'magic' (matrix product growth) in a query to produce a much larger sequence of numbers.  I will show you a trick to create a query-only source of numbers, but I do a little cheating with my From clause.

Note: In the following code examples, I am starting the numbering sequence after highest current number, using 1 if the table is empty.  This is an efficiency consideration -- appending new numbers rather than trying to repopulate the entire table.  Alternatively, this code can be changed to delete all the rows before populating the (now) empty table.  


_____________________________________

Create the TallyTable with code

Public Sub MakeTallyTable(Optional parmLimit As Long = 1000, Optional parmPopulate As Boolean = False)
  Dim lngLoop As Long
  Dim td As TableDef
  Dim fd As Field
  Dim vID As Variant
  Const cCommitInterval As Long = 500
  
  'Create the Tally Table if it doesn't already exist.
  On Error Resume Next
  vID = DBEngine(0)(0).TableDefs("TallyTable").Name
  If vID = "TallyTable" Then
  Else
    Set td = DBEngine(0)(0).CreateTableDef("TallyTable")
    Set fd = New Field
    fd.Name = "ID"
    fd.Type = dbLong
    td.Fields.Append fd
    DBEngine(0)(0).TableDefs.Append td
  End If

'************************
'Optional population of the table
  If parmPopulate Then
    vID = DMax("ID", "TallyTable")
    If IsNull(vID) Then
      vID = 0
    End If
    If vID < parmLimit Then
      DBEngine(0).BeginTrans
      For lngLoop = vID + 1 To parmLimit
        DBEngine(0)(0).Execute "Insert Into TallyTable (ID) Values (" & lngLoop & ")"
        If (lngLoop Mod cCommitInterval) = 0 Then
          DBEngine(0).CommitTrans
          DBEngine(0).BeginTrans
        End If
      Next
      DBEngine(0).CommitTrans
    End If
  End If
'************************
End Sub

Open in new window



_____________________________________

Create the TallyTable with SQL

Create Table TallyTable2
  (ID Long)

Open in new window

Note: You can also rewrite the first 'populating query' below, converting them into a Make Table query.

_____________________________________

Populating the Tally Table with code

Public Sub PopulateTallyTable(Optional parmLimit As Long = 1000)
  Dim lngLoop As Long
  Const cCommitInterval As Long = 500
  
  vID = DMax("ID", "TallyTable")
  If IsNull(vID) Then
    vID = 0
  End If
  If vID < parmLimit Then
    DBEngine(0).BeginTrans
    For lngLoop = vID + 1 To parmLimit
      DBEngine(0)(0).Execute "Insert Into TallyTable (ID) Values (" & lngLoop & ")"
      If (lngLoop Mod cCommitInterval) = 0 Then
        DBEngine(0).CommitTrans
        DBEngine(0).BeginTrans
      End If
    Next
    DBEngine(0).CommitTrans
  End If
End Sub

Open in new window



_____________________________________

Populating the Tally Table with SQL

When using queries to generate a number sequence, there still must be a table with one or more rows.  To keep this example simple, I'm populating my source table with ten numbers (1-10).  You can use a non-empty system table if you don't already have a non-empty table.

Note: The following SQL will not run, as shown, in Access97 since it doesn't support the From (Select...) form of SQL.  Access97 developers can work around that limitation by saving the Unioned Select statements as a separate query and then include the name of that query in your From clause.

I'm using the MSysObjects table in this example.
QBTallyTableBase10 query:
SELECT DISTINCT ID
FROM (SELECT Top 1  1  As ID FROM MSysObjects
  UNION ALL
SELECT Top 1  2   FROM MSysObjects
  UNION ALL
SELECT Top 1  3  FROM MSysObjects
  UNION ALL
SELECT Top 1  4  FROM MSysObjects
  UNION ALL
SELECT Top 1  5  FROM MSysObjects
  UNION ALL
SELECT Top 1  6  FROM MSysObjects
  UNION ALL
SELECT Top 1  7  FROM MSysObjects
  UNION ALL
SELECT Top 1  8  FROM MSysObjects
  UNION ALL
SELECT Top 1  9  FROM MSysObjects
  UNION ALL 
SELECT Top 1 10 FROM MSysObjects);

Open in new window


However, there is a performance penalty for generating all the numbers from such 'query magic'.  I created a TallyTable2 with the same ten numbers and compare the performance of 1K and 1M number retrievals below.
Populate an already defined TallyTable2 from the magic query.
Insert Into TallyTable2 (ID)
Select ID From QBTallyTableBase10

Open in new window


Note: You can also create and populate the TallyTable2 with a Make-Table form of the query:

Make-Table Example 1 -- Single query
SELECT DISTINCT ID INTO TallyTable2
FROM (SELECT Top 1  1  As ID FROM MSysObjects
  UNION ALL
SELECT Top 1  2   FROM MSysObjects
  UNION ALL
SELECT Top 1  3  FROM MSysObjects
  UNION ALL
SELECT Top 1  4  FROM MSysObjects
  UNION ALL
SELECT Top 1  5  FROM MSysObjects
  UNION ALL
SELECT Top 1  6  FROM MSysObjects
  UNION ALL
SELECT Top 1  7  FROM MSysObjects
  UNION ALL
SELECT Top 1  8  FROM MSysObjects
  UNION ALL
SELECT Top 1  9  FROM MSysObjects
  UNION ALL 
SELECT Top 1  10 FROM MSysObjects);

Open in new window


Make-Table Example 2 -- Using QBTallyTableBase10
SELECT QBTallyTableBase10.ID INTO TallyTable2
FROM QBTallyTableBase10;

Open in new window


Now that we have a small set of (ten) numbers, we use the multiplicative property of table joins to produce much larger sets.
QueryBasedTallyTable_BIG -- 1M numbers using 10-row table
SELECT ((QBTT6.ID-1)*100000)+((QBTT5.ID-1)*10000)+((QBTT4.ID-1)*1000)+((QBTT3.ID-1)*100)+((QBTT2.ID-1)*10)+QBTT.ID AS ID
FROM TallyTable2 AS QBTT, TallyTable2 AS QBTT2, TallyTable2 AS QBTT3, TallyTable2 AS QBTT4, TallyTable2 AS QBTT5, TallyTable2 AS QBTT6;

Open in new window


QueryBasedTallyTable -- 1K numbers using 10-row table
SELECT ((QBTT3.ID-1)*100)+((QBTT2.ID-1)*10)+QBTT.ID AS ID
FROM TallyTable2 AS QBTT, TallyTable2 AS QBTT2, TallyTable2 AS QBTT3;

Open in new window


QueryBasedTallyTable_BIG_AllQry -- 1M numbers using only QBTallyTableBase10 query data
SELECT ((TT6.ID-1)*100000)+((TT5.ID-1)*10000)+((TT4.ID-1)*1000)+((TT3.ID-1)*100)+((TT2.ID-1)*10)+TT.ID AS ID
FROM QBTallyTableBase10 AS TT, QBTallyTableBase10 AS TT2, QBTallyTableBase10 AS TT3, QBTallyTableBase10 AS TT4, QBTallyTableBase10 AS TT5, QBTallyTableBase10 AS TT6;

Open in new window


QueryBasedTallyTable_AllQry -- 1K numbers using only QBTallyTableBase10 query data
SELECT DISTINCT ((TT3.ID-1)*100)+((TT2.ID-1)*10)+TT.ID AS ID
FROM QBTallyTableBase10 AS TT, QBTallyTableBase10 AS TT2, QBTallyTableBase10 AS TT3;

Open in new window


Note: You could have populated the TallyTable table with the numbers produced by these queries.  We will leave our TallyTable2 table with only 10 rows.  It is up to you to decide whether you want to use the code, define and populate queries, or a make-table query to populate your full-sized TallyTable.

As I mentioned earlier, there is a performance penalty when calculating all the numbers in memory on-the-fly.  Since later usage examples only use the queries based on TallyTable2, you ought to understand why I didn't use the all-query number source.
QueryBasedTallyTable performance comparisons
Recordset type           	Avg time   	min   	max
=========================    	========	====	====
1K table rows (Dynaset) 	22.00          	16     	31
1K table rows (snapshot)	24.80        	15     	31
1K query rows (Dynaset) 	62.60        	47     	94
1K query rows (snapshot)	62.20         	47     	78
1M table rows (Dynaset) 	22308.60   	20686	25319
1M table rows (Snapshot)	25915.00   	24570	26958
1M query rows (Dynaset) 	27827.80   	26411	28596
1M query rows (Snapshot)	29272.00   	28392	30575

Open in new window



_____________________________________

Date ranges problem - TallyTable Example 1

This article got its genesis from this recent question about looping in MS-Access SQL.
http://www.experts-exchange.com/Q_26985283.html

The solution needs to produce pairs of dates for reporting periods.  The first column is the start of each logical week in the period and will be one of the following:
Starting date
Monday
First day of the month
The second column is the end of each logical week in the period and will be one of the following:
Ending date
Sunday
Last day of the month
An example of the desired data is posted here:
http://www.experts-exchange.com/Q_26985283.html#a35492707

During my preparation for this article, I did some performance tuning to the queries, so the following (optimized) queries are different than those posted in the question.  There are two configurations, one based on the TallyTable table and the other based on the QueryBasedTallyTable query.  

Notes:
All the queries have two parameters ([start date] and [end date])
The use of the Distinct keyword is used for performance.  Back in the 90s, I discovered that ordered queries act as if they were indexed.  This may be caused by caching or some other query optimization 'magic'.  If you find that your queries don't execute as quickly as you'd like, try adding Distinct keyword, Order By clause, Or Group By clause.
If you need to update the resulting query results, do not use the Distinct or Group By performance-enhancing trick.  Their use results in a non-updatable record source.

Now that we have our TallyTable and its query equivalent, QueryBasedTallyTable, we can construct our date sources for the two columns.

JustDatesFromTallyTable
Now that we have our numbers, we need a source of dates for our columns.
PARAMETERS [start date] DateTime, [end date] DateTime;
SELECT DISTINCT TT.ID, ([start date]+[ID]-1) AS GenDate
FROM TallyTable As TT
WHERE (TT.ID <= ([end date]-[start date]+1) );

Open in new window


ReportingPeriods_Fast
Now that we have a source of dates, we have a solution to the problem.
SELECT DISTINCT T.GenDate, (Select Min(NxD.gendate) From JustDatesFromTallyTable As NxD  Where (NxD.ID Between T.ID And (T.ID+7)) And (Weekday(Nxd.GenDate)=1 Or NxD.GenDate = [end date] Or NxD.GenDate = DateSerial(Year(NxD.GenDate),Month(NxD.GenDate)+1,0) )) AS EOW
FROM JustDatesFromTallyTable AS T
WHERE (((Weekday([T].[GenDate]))=2)) OR (((Day([T].[GenDate]))=1)) OR (((T.ID)=1));

Open in new window


JustDatesFromQBTallyTable
Let's not forget the query-based Tally Table we created earlier.  We need to produce dates from these numbers as well.
PARAMETERS [start date] DateTime, [end date] DateTime;
SELECT DISTINCT TT.ID, ([start date]+[ID]-1) AS GenDate
FROM QueryBasedTallyTable AS TT
WHERE (((TT.ID)<=([end date]-[start date]+1)));

Open in new window


ReportingPeriodsQB_Fast
Now we can produce the two-column query for the question, based on the query dates.
SELECT DISTINCT T.GenDate, (Select Min(NxD.gendate) From JustDatesFromQBTallyTable As NxD  Where (NxD.ID Between T.ID And (T.ID+7)) And (Weekday(Nxd.GenDate)=1 Or NxD.GenDate = [end date] Or NxD.GenDate = DateSerial(Year(NxD.GenDate),Month(NxD.GenDate)+1,0) )) AS EOW
FROM JustDatesFromQBTallyTable AS T
WHERE (((Weekday(T.GenDate))=2)) Or (((Day(T.GenDate))=1)) Or (((T.ID)=1));

Open in new window


How is the performance?
The table-versus-query performance comes down to a space/time trade-off.  The TallyTable takes up space in your database, but the numbers are available immediately.  The query requires a very small (ten row) table, but the numbers must be calculated when used.  The test involved opening a recordset from a parameterized query and moving to the last record.  This was repeated 10 times.  Different configurations of date ranges and recordset types were timed.  The performance test was run at least three times.  In addition, for two of the tests I added an index to the table and a primary key index to the table.  Although the indexes improved performance, we doubled the size of the TallyTable.  The time measurement is ticks.  These times represent a 10 iteration execution.

This is the test of the date range posted in the question (2/22/2011 - 7/9/2011).
There is no index on TallyTable used in this test.
Recordset type   	Avg time   	min   	max
==============    	========	====	====
Dynaset (table)  	707.00      	686   	733
Snapshot (table)	738.67      	718   	749
Snapshot (query)	1456.00     	1451  	1466
Dynaset (query) 	1487.33    	1482 	1498

Open in new window


I thought it would be worth comparing a larger date range (2/22/2011 - 8/15/2012), which is 3.6 times the test above.  This is good practice in both capacity and performance testing.  Since my tests indicated that the performance degradation was non-linear, I decided to evaluate the influence of indexes.
*** No index on TallyTable ***
Recordset type   	Avg time   	min   	max
==============    	========	====	====
Snapshot            	8164.00    	8065 	8315
Dynaset             	8398.00    	7956 	8705

*** Non-PK Index on TallyTable **
Recordset type   	Avg time   	min   	max
===============     	========	====	====
Dynaset              	7498.33    	7441 	7535
Snapshot           	7555.67    	7519 	7598

*** Primary Key index on TallyTable ***
Recordset type   	Avg time   	min   	max
==============     	========	====	====
Snapshot            	7695.67    	7612 	7816
Dynaset              	7748.33    	7551 	8112 

*** Query ***
Recordset type   	Avg time   	min   	max
==============     	========	====	====
Snapshot           	12972.33   	12605	13728
Dynaset              	13040.00   	12745	13838

Open in new window



_____________________________________

Simulating Production Data Volume - TallyTable Example 2

It is helpful to stress test your database designs and applications using large sets of data.  When you add your TallyTable (or query) to existing data sources in your queries you can easily multiply its row count.  If you had a 1,000 row table, but new that the production table would be 10 million rows, you could add a TallyTable (or  query) to your test-table query and limit the TallyTable to 10,000 rows.


_____________________________________

Parameterized Top N - TallyTable Example 3

If you have tried to create a Select query with a Top N keyword and ever wished you could parameterize it, a TallyTable might be the solution.

This can be implemented a few different ways, depending on the source and nature of the data.
Join your TallyTable with an ordinalcount column and limit the TallyTable rows with a parameter.
Add (select count(*) ) as an ordinalcount column with direct parameter comparison
Add DCount() as an ordinalcount column with direct parameter comparison

In order to present a reasonable test, I had to make some assumptions about the data.
Either there is no autonumber field or the table has some deleted rows.  This is the reason why I didn't test an inner join between the TallyTable and my performance test table.  Yes, it is bad design/practice to have a table without an autonumber field.  
If you have an autonumber field on a read-only table (no deletes), you could parameterize the Top N by comparing on this field
If you have an autonumber field on a table with deleted rows, you should use that field in your ordinal count calculations, since long integer comparisons are much faster than text field comparisons (used in this test).
I cast my ordinal count column as a long integer data type for faster comparison
I explicitly defined my parameter as a long integer data type for faster comparison.
The data might be a query instead of a table

Note: Beware of lousy performance due to RBAR (row-by-agonizing-row) operations.  This is most pronounced in the Word Count example queries below.

There is a very clever DistinctDCount() alternative -- not shown or tested in this section, since I only care about row counts.
http://www.experts-exchange.com/A_2417.html


SQL Highlights:
To get reasonably good performance from my queries, I made sure that the query parameters were explicitly typed.
PARAMETERS ____ Long;

Open in new window


In the performance figures, the Select refers to a column defined like this:
(Select Count(*) From Lot_Mast As C Where C.[lot-no] >= lot_mast.[lot-no]) AS OrdinalValue

Open in new window


In the performance figures, the DCount refers to a column defined like this.. Note that I have also added a cast to the column data as a long integer.  It was surprising that this made such a performance difference.
CLng(DCount("[lot-no]","[Lot_mast]","[lot-no]>='" & [lot-no] & "'")) AS OrdinalValue

Open in new window


ParameterizedTop_N performance tests - no index on the TallyTable
Recordset type           		Avg time   	min   	max
============================   		========	====	====
Select Tally Join (Snapshot)    	156.00      	156   	156
Select Tally Join (Dynaset)		161.33      	156   	172
Select (Snapshot)			228.67       	218   	234
Select (Dynaset)			234.33      	219   	265
Dcount Tally Join (Dynaset)		894.67      	874   	921
Dcount Tally Join (Snapshot)    	920.67      	906   	936
Dcount (Dynaset)			1481.67     	1466  	1513
Dcount (Snapshot)			1549.67    	1513 	1591
Dcount Tally (Snapshot)  		2106.00    	2090 	2137
Dcount Tally (Dynaset)   		2158.00    	2106 	2184
Select Tally (Dynaset)   		5179.33    	5117 	5257
Select Tally (Snapshot) 		5200.00    	5117 	5288

Open in new window

In case you were curious, the fastest result was with a primary key index on the TallyTable
Recordset type           		Avg time   	min   	max
============================   		========	====	====
Select Tally Join (Snapshot)    	140.67   	140	141

Open in new window



_____________________________________

Word/character counts - TallyTable Example 4

There are many times when I need to know how many delimited words there are in a text field.  Usually, I have written a small user-defined function that does the string inspection, since the VBA environment has looping and parsing functions (Split and Replace).  It can also be done using a TallyTable.

I tested two different queries to find the delimiter positions.  One used the MID() function and the other used the InStr() function.

DelimiterPositions
PARAMETERS Delim Text ( 255 );
SELECT DISTINCT DataToSplit.PersonID, TallyTable.ID
FROM DataToSplit, TallyTable
WHERE (((TallyTable.ID)<=255) AND ((CBool(Len(Mid([Delim] & [data] & [Delim],[id],2))=Len([Delim]) And Mid([Delim] & [data] & [Delim],[id],2)=[Delim]))=True));

Open in new window


DelimiterPositions_Instr
PARAMETERS Delim Text ( 255 );
SELECT DISTINCT DataToSplit.PersonID, TallyTable.ID
FROM DataToSplit, TallyTable
WHERE (((TallyTable.ID)<=255) AND (Instr(ID,[Delim] & [data] & [Delim],[Delim])=[ID]));

Open in new window


Word count
Once you know the delimiter positions, you can use a Group By query to get the count of words.  Since I added delimiters to the front and back of the text data in my DelimiterPositions data, I had to subtract one from the Count() value.
SELECT DelimiterPositions.PersonID, Count(DelimiterPositions.ID)-1 AS WordCount
FROM DelimiterPositions
GROUP BY DelimiterPositions.PersonID;

Open in new window


Parsed/Split text data
Since I've already calculated the delimiter positions, It is an easy step to parse the text.
PARAMETERS Delim Text ( 255 );
SELECT DataToSplit.PersonID, Trim(Mid([data],[DelimiterPositions_Instr].[ID],[NextDelimPosn]-[DelimiterPositions_Instr].[ID])) AS [Procedure], DelimiterPositions_Instr.ID, InStr([ID]+1,[delim] & [data] & [Delim],[Delim]) AS NextDelimPosn
FROM DataToSplit INNER JOIN DelimiterPositions_Instr ON DataToSplit.PersonID = DelimiterPositions_Instr.PersonID
WHERE (((InStr([ID]+1,[delim] & [data] & [Delim],[Delim]))<>0))
ORDER BY DataToSplit.PersonID, DelimiterPositions_Instr.ID;

Open in new window


Note: Although I could have done this parsing by joining the DelimiterPositions_Instr query to itself, it was not nearly as good a performer as the ParsedData query above.

VBA Function Splitting
If you find that you need a more sophisticated word count or splitting process, you should read these articles:
http://www.experts-exchange.com/A_1679.html
http://www.experts-exchange.com/A_1480.html

If this article has been helpful, please click the Yes link below.
5
Comment
Author:aikimark
  • 2
2 Comments
 
LVL 47

Author Comment

by:aikimark
Attention readers:

I caught an error on the first of the delimiter position queries.  The length parameter of the Mid() function was a literal.  It should have been equal to the length of the [Delim] parameter.  I also simplified the query, although it still does not perform as well as the InStr version of the query.

DelimiterPositions
PARAMETERS Delim Text ( 255 );
SELECT DISTINCT DataToSplit.PersonID, TallyTable.ID
FROM DataToSplit, TallyTable
WHERE ((TallyTable.ID<=255) 
AND 
(Mid([Delim] & [data] & [Delim],[id],Len([Delim]))=[Delim]));

Open in new window

0
 
LVL 47

Author Comment

by:aikimark
Here is a similar question, where a data source (query) contains startdate and stopdate columns and the member needed to populate another table with multiple rows, containing individual dates between these two date column values.

A new constraint in this question is the need to prevent duplicate sets of rows when the process was repeated.  Therefore, a new table was required (HasBeenCloned) to persist the key values involved in the copying process.

http:/Q_27663167.html
0

Featured Post

Cloud Class® Course: Certified Penetration Testing

This CPTE Certified Penetration Testing Engineer course covers everything you need to know about becoming a Certified Penetration Testing Engineer. Career Path: Professional roles include Ethical Hackers, Security Consultants, System Administrators, and Chief Security Officers.

Join & Write a Comment

Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…
A query can call a function, and a function can call Excel, even though we are in Access. This is Part 2, and steps you through the VBA that "wraps" Excel functionality so we can use its worksheet functions in Access. The declaration statement de…

Keep in touch with Experts Exchange

Tech news and trends delivered to your inbox every month