Complex sequential query with vba functions takes a long time

I have a table where each record is a series of start and stop dates for each of the steps in a sequence of tasks (with 20 separate tasks per record, and about 1500 records).  A couple of the tasks are optional, and there are some parallel paths (e.g. it always starts at step A, then goes through one or both of step B or "step C then step D" before meeting again at step E and so on - I don't think that in itself makes the calculation any harder, but it does tell us that the calculation required for each step is not consistent - for example, step D starts when step C finishes, but step E can only start when B and D have finished).

The table records progress as the steps are completed, but I then have to produce a forecast of the end date for the steps which have not completed (and mostly have not started).  I have defined a query to generate that forecast.

The estimation for the forecast starts at the step in the sequence which has started but not yet completed - I estimate the end date for that step based on a known lead time for that step, and some user-entered delays (in a separate table) for that step.  The calculation then proceeds sequentially - I take that estimated end date and use it as the estimated start date for the next step.  I then calculate the estimated end date for that next step, and use that as the estimated start date for the next step after that, and so on.

My problem is that Access does not do the calculation sequentially - it appears to perform all of the calculations in the query at once, then realises that some of the fields rely on other fields which have changed, so recalculates those dependent fields, that then causes other dependent fields to recalculate and so on. Eventually the recalculation does not affect any other fields, and the query is complete.  In a couple of the fields, the calculation is so complex that I have had to use VBA functions - I put some debug in one of them, and though it only need to be called 16 times to complete one record, it was actually called over 200 times.

I'm sure there is a better way to do this - I can't put all of the calculations in one big VBA function because I need to see those intermediate estimated start end end dates (to allow me to forecast when people will be needed to do the work in that particular step).  Is there a way to make Access do the calculations in the correct order?
LVL 20
Who is Participating?
mbizupConnect With a Mentor Commented:
Your question would be easier to follow if you posted a sample, but if I'm understanding you correctly your query looks something like this:

SELECT Field1, Field2, CustomVBAFunction1(Field3) AS Calculation1,  CustomVBAFunction2(Field4) AS Calculation2, etc...
FROM YourTable

Open in new window

VBA functions embedded in your queries' SQL tend to slow queries down because they are performed on all records selected.

A quicker approach might be to select the needed fields (without the functions) and use recordset code to call those functions only for records meeting whatever criteria you need... something like this (this is very sketchy because we don't have much detail about your specific query/functions):

Dim rs as dao.recordset
dim strSQL as string

strSQL = "SELECT Field1, Field2, Field3, etc.... FROM YourTable"

SET rs = currentDB.OpenRecordset(strSQL, dbOpenDynaset)

if rs!Field1 = SomeValue then
      Something = VBAFunction1(rs!Field2)
End if
' Do something with this calculation
' etc...

Open in new window

You appear to be saying that you get the right answers.

So the problem is what ?
simon3270Author Commented:
Yes, a sample would have been more useful :-)

My problem is the time taken - it's currently over an hour to generate the 1500 results - I made the calculation a bit more complex (the ways delays were defined changed), and it was taking tens of minutes per record (I didn't wait for all 1500!).

The original table has 6 fields describing the record, then 23 pairs of start and stop dates, (11 pairs describing the processing path of one end of a circuit, 11 describing the other end, and one describing the work done on the entire circuit when both ends are complete).

A typical set of fields in the query, processing a single pair of dates for Step 04 (Step04StartDate and Step04EndDate) in the sequence (immediately follows Step 03), could be:
IIf(IsNull([Step03EndDate]),[Step03EstEndDate],[Step03EndDate]) AS Step04EstStartDate,
DLookUp("[LeadTime]","[LeadTimes]","[level] = 'Step04'") AS Step04LeadTime,
DSum("[Days]","[Delay]","[Source] = 'Span'  AND [RecID] = '" & [Span].[SpanID] & "' AND [Step] = 'Step04'") AS Step04Delays,
DateAddW('w',Val(Nz([Step04LeadTime],0))+Val(Nz([Step04Delays],0)),IIf(IsNull([Step04StartDate]),[Step04EstStartDate],[Step04StartDate])) AS Step04EstEndDate,

Open in new window

The first line uses the actual end date of Step03 if it is defined in the source table, otherwise uses the estimated end date of Step03, to get an estimated start date for this step.  Line 2 is the actual start date for this step from the source table, which may or may not be filled in.  Line 3 looks up the lead time in days for this step in a LeadTimes table (in a table so that the values can be changed to allow what-if analysis).  Line 4 adds up any extra delay days specific to this step for this record (stored in a separate Delays table, added there by the user - there may be none for this step).  Line 5 calls my VBA function to add the Lead Time and Delay values to the start date, skipping weekends, bank holidays and other "freeze" periods, to get an estimated end date.  Line 6 is a copy of the actual end date for this step from the source table (again, will only be filled in when the step actually ends).

Some of the steps are optional so have no estimated start date, but do have an estimated end date based on the actual start date if it is filled in.  Others depend on multiple end dates to calculate their estimated start date.

I made the calculation more complex by allowing delays to be defined as a number of days or by a specific end date for the delay.  I did this by changing line 4 to pass start dates and lead times for the Step and the identifying information from the DSum call to a VBA function, so that the function could then process the delays and determine whether they were day-based or date-based delays.  It was then that the processing time rocketed from  a second or two per record to 10s of minutes per record.
Get your problem seen by more experts

Be seen. Boost your question’s priority for more expert views and faster solutions

simon3270Author Commented:
As for selection of the records I need to process, it is either all records (to generate reports), or one record (to display a form with that records actual and estimated dates).
Jim Dettman (Microsoft MVP/ EE MVE)Connect With a Mentor President / OwnerCommented:
Couple of comments in general.

On this:

<<My problem is that Access does not do the calculation sequentially >>
 When a VBA function is called from a query, it's called once if it passes no arguments to the function

 If a field argument is passed, it is called once for each row.

 If a column has criteria on it, it is called twice for each row.

On this SQL:


   Using Domain function in a query (SQL Statement) is a big no-no performance wise.   The query parser cannot optimize the query in any way.

  All the domain functions represent SQL Statements and were intended to be used where SQL statements are not.  A query (which is a SQL statement) is not one of those places.

  Your Dlookup() for example should be a join to another table.

  Also, you are using an IIF().   With IIF(), both true and false arguments are evaluated whether needed or not.   Don't use IIF()'s in a SQL statement unless the arguments are simple.

  Beyond all that, follow Miriam's advice.

Jim Dettman (Microsoft MVP/ EE MVE)President / OwnerCommented:
and BTW, on this:

<<The original table has 6 fields describing the record, then 23 pairs of start and stop dates, (11 pairs describing the processing path of one end of a circuit, 11 describing the other end, and one describing the work done on the entire circuit when both ends are complete).>>

 It sounds like the table design is what we call non-normalized.   You should not have repeating fields in a table (start1, start2, start3, etc).

  This generally indicates the need to break out the data into a seperate table.

  This is most likely another reason your seeing poor query performance.

  Normalizing the table design for a relational database is critical for good performance.

  I would go with a VBA function as Miriam outlined.

simon3270Author Commented:
Thanks to both of you for the suggestions.  It looks as though I have a little work to do!
Dale FyeCommented:
1.  Have you tried doing this in a series of queries rather than a single query.  With a series of queries, you can focus on specific relationships (A to B) (B to C), ...

You could also use criteria to target specific records, so you handle the records that go from A-B-C-D-E differently than those records where the sequence is A-B-E.

This would mean fewer Iif( ) functions and possible reduce the number of other function calls.
simon3270Author Commented:
@fyed: I think the number of queries would need to be quite large, and it would be hard to guarantee that records were all processed but only processed once - the A and B ends of the circuit proceed independently and usually by different routes (one end might use A-C-D-E, the other A-E, or A-B-C-D-E, or both might be A-B-E, and the ends will almost certainly be at different stages on that path).  I'm looking at VBA for the entire record and Joins for improvements in the first place (I had in fact started doing the VBA option a few weeks ago, but the Query was simpler then so the performance improvement at the time was minimal).
Dale FyeCommented:
Doesn't matter how many queries it takes, but ensuring that every record is processed can be an issue.  Using a VBA procedure to process records one at a time is rarely more efficient than running queries, but is sometimes necessary.

Personally, I think your table structure is flawed (20+ different process steps with separate in and out times begs fo a structure like:

Process_type (this would define the steps in the particular process.
simon3270Author Commented:
Yes, I think it is flawed, but we are where we are!

Having a separate Steps table might be doable (just have to make time to rewrite everything), but a couple more requirements came up this afternoon which definitely drive me towards one big VBA routine (they want another couple of steps added which have very different delay mechanisms and impacts - it's just getting too complicated to manage a query).
Dale FyeCommented:
"... they want another couple of steps added ..."

Is the main reason for going with the structure I mentioned above.  You never know when someone is going to want more steps in the process (more columns).  More columns means re-coding, rewriting queries, ...

With the structure I recommended, and probably another table the describes which steps belong in which processes, you can avoid most of the coding and query re-writes.

Good luck.
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.