Solved

Complex sequential query with vba functions takes a long time

Posted on 2013-05-31
Medium Priority
651 Views
Last Modified: 2013-05-31
I have a table where each record is a series of start and stop dates for the steps in a sequence of tasks (20 separate tasks per record, and about 1500 records).  A couple of the tasks are optional, and there are some parallel paths: the sequence always starts at step A, then goes through step B, "step C then step D", or both, before the paths meet again at step E, and so on.  I don't think that in itself makes the calculation any harder, but it does mean that the calculation required for each step is not consistent - for example, step D starts when step C finishes, but step E can only start when both B and D have finished.

The table records progress as the steps are completed, but I then have to produce a forecast of the end date for the steps which have not completed (and mostly have not started).  I have defined a query to generate that forecast.

The estimation for the forecast starts at the step in the sequence which has started but not yet completed - I estimate the end date for that step based on a known lead time for that step, and some user-entered delays (in a separate table) for that step.  The calculation then proceeds sequentially - I take that estimated end date and use it as the estimated start date for the next step.  I then calculate the estimated end date for that next step, and use that as the estimated start date for the next step after that, and so on.

My problem is that Access does not do the calculation sequentially - it appears to perform all of the calculations in the query at once, then realises that some of the fields rely on other fields which have changed, so it recalculates those dependent fields; that in turn causes other dependent fields to recalculate, and so on.  Eventually the recalculation no longer affects any other fields, and the query is complete.  In a couple of the fields, the calculation is so complex that I have had to use VBA functions - I put some debug output in one of them, and though it only needed to be called 16 times to complete one record, it was actually called over 200 times.

I'm sure there is a better way to do this - I can't put all of the calculations in one big VBA function because I need to see those intermediate estimated start and end dates (to allow me to forecast when people will be needed to do the work in that particular step).  Is there a way to make Access do the calculations in the correct order?
Question by:simon3270
12 Comments
 
LVL 77

Expert Comment

by:peter57r
ID: 39210226
You appear to be saying that you get the right answers.

So the problem is what ?
 
LVL 61

Accepted Solution

by:
mbizup earned 1000 total points
ID: 39210238
Your question would be easier to follow if you posted a sample, but if I'm understanding you correctly your query looks something like this:

SELECT Field1, Field2, CustomVBAFunction1(Field3) AS Calculation1,  CustomVBAFunction2(Field4) AS Calculation2, etc...
FROM YourTable




VBA functions embedded in your queries' SQL tend to slow queries down because the functions are evaluated for every record selected.

A quicker approach might be to select the needed fields (without the functions) and use recordset code to call those functions only for records meeting whatever criteria you need... something like this (this is very sketchy because we don't have much detail about your specific query/functions):


Dim rs As DAO.Recordset
Dim strSQL As String

strSQL = "SELECT Field1, Field2, Field3 FROM YourTable"   ' add the other fields you need

Set rs = CurrentDb.OpenRecordset(strSQL, dbOpenDynaset)

' Walk the recordset and only call the function where it is actually needed
Do While Not rs.EOF
    If rs!Field1 = SomeValue Then
        Something = VBAFunction1(rs!Field2)
        ' Do something with this calculation
    End If
    rs.MoveNext
Loop

rs.Close
Set rs = Nothing

 
LVL 20

Author Comment

by:simon3270
ID: 39210348
Yes, a sample would have been more useful :-)

My problem is the time taken - it currently takes over an hour to generate the 1500 results.  When I made the calculation a bit more complex (the way delays are defined changed), it was taking tens of minutes per record (I didn't wait for all 1500!).

The original table has 6 fields describing the record, then 23 pairs of start and stop dates, (11 pairs describing the processing path of one end of a circuit, 11 describing the other end, and one describing the work done on the entire circuit when both ends are complete).

A typical set of fields in the query, processing a single pair of dates for Step 04 (Step04StartDate and Step04EndDate) in the sequence (immediately follows Step 03), could be:
IIf(IsNull([Step03EndDate]),[Step03EstEndDate],[Step03EndDate]) AS Step04EstStartDate,
Span.Step04StartDate,
DLookUp("[LeadTime]","[LeadTimes]","[level] = 'Step04'") AS Step04LeadTime,
DSum("[Days]","[Delay]","[Source] = 'Span'  AND [RecID] = '" & [Span].[SpanID] & "' AND [Step] = 'Step04'") AS Step04Delays,
DateAddW('w',Val(Nz([Step04LeadTime],0))+Val(Nz([Step04Delays],0)),IIf(IsNull([Step04StartDate]),[Step04EstStartDate],[Step04StartDate])) AS Step04EstEndDate,
Span.Step04EndDate,


The first line uses the actual end date of Step03 if it is defined in the source table, otherwise uses the estimated end date of Step03, to get an estimated start date for this step.  Line 2 is the actual start date for this step from the source table, which may or may not be filled in.  Line 3 looks up the lead time in days for this step in a LeadTimes table (in a table so that the values can be changed to allow what-if analysis).  Line 4 adds up any extra delay days specific to this step for this record (stored in a separate Delays table, added there by the user - there may be none for this step).  Line 5 calls my VBA function to add the Lead Time and Delay values to the start date, skipping weekends, bank holidays and other "freeze" periods, to get an estimated end date.  Line 6 is a copy of the actual end date for this step from the source table (again, will only be filled in when the step actually ends).
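(For reference, DateAddW is a custom VBA function of mine.  A minimal sketch of that kind of workday arithmetic - assuming only weekends are skipped; the real function also consults bank-holiday and "freeze"-period tables - would be something like:)

```vba
' Sketch only: adds Number workdays to StartDate, skipping weekends.
' The real DateAddW also skips bank holidays and freeze periods.
Public Function DateAddWSketch(ByVal Number As Long, ByVal StartDate As Date) As Date
    Dim d As Date
    Dim remaining As Long
    d = StartDate
    remaining = Number
    Do While remaining > 0
        d = DateAdd("d", 1, d)
        If Weekday(d, vbMonday) <= 5 Then   ' Monday..Friday count as workdays
            remaining = remaining - 1
        End If
    Loop
    DateAddWSketch = d
End Function
```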

Some of the steps are optional so have no estimated start date, but do have an estimated end date based on the actual start date if it is filled in.  Others depend on multiple end dates to calculate their estimated start date.

I made the calculation more complex by allowing delays to be defined as a number of days or by a specific end date for the delay.  I did this by changing line 4 to pass start dates and lead times for the Step and the identifying information from the DSum call to a VBA function, so that the function could then process the delays and determine whether they were day-based or date-based delays.  It was then that the processing time rocketed from  a second or two per record to 10s of minutes per record.

 
LVL 20

Author Comment

by:simon3270
ID: 39210349
As for selection of the records I need to process, it is either all records (to generate reports), or one record (to display a form with that record's actual and estimated dates).
 
LVL 58

Assisted Solution

by:Jim Dettman (Microsoft MVP/ EE MVE)
Jim Dettman (Microsoft MVP/ EE MVE) earned 1000 total points
ID: 39210546
A couple of comments in general.

On this:

<<My problem is that Access does not do the calculation sequentially >>
 
 When a VBA function is called from a query, it is called once if no field arguments are passed to the function.

 If a field argument is passed, it is called once for each row.

 If a column has criteria on it, it is called twice for each row.

On this SQL:

<<IIf(IsNull([...>>

   Using domain functions in a query (SQL statement) is a big no-no performance-wise: the query parser cannot optimize the query in any way.

  Each domain function represents a SQL statement of its own, and they were intended for places where SQL statements cannot be used.  A query (which is a SQL statement) is not one of those places.

  Your Dlookup() for example should be a join to another table.
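  For example, the Step04 lead-time lookup could be pulled in with a join rather than a DLookup() - something along these lines (just a sketch, assuming the LeadTimes table has the [level] and [LeadTime] fields used in the DLookup above):

```sql
SELECT Span.SpanID,
       Span.Step04StartDate,
       LT.LeadTime AS Step04LeadTime
FROM Span, LeadTimes AS LT
WHERE LT.level = 'Step04';
```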

  Also, you are using an IIF().  With IIF(), both the true and false arguments are evaluated whether they are needed or not.  Don't use IIF()'s in a SQL statement unless the arguments are simple.

  Beyond all that, follow Miriam's advice.

Jim.
 
LVL 58
ID: 39210555
and BTW, on this:

<<The original table has 6 fields describing the record, then 23 pairs of start and stop dates, (11 pairs describing the processing path of one end of a circuit, 11 describing the other end, and one describing the work done on the entire circuit when both ends are complete).>>

 It sounds like the table design is what we call non-normalized.   You should not have repeating fields in a table (start1, start2, start3, etc).

  This generally indicates the need to break out the data into a separate table.

  This is most likely another reason you're seeing poor query performance.

  Normalizing the table design for a relational database is critical for good performance.

  I would go with a VBA function as Miriam outlined.

Jim.
 
LVL 20

Author Closing Comment

by:simon3270
ID: 39210570
Thanks to both of you for the suggestions.  It looks as though I have a little work to do!
 
LVL 49

Expert Comment

by:Dale Fye
ID: 39210582
1.  Have you tried doing this in a series of queries rather than a single query?  With a series of queries, you can focus on specific relationships (A to B), (B to C), ...

You could also use criteria to target specific records, so you handle the records that go from A-B-C-D-E differently than those records where the sequence is A-B-E.

This would mean fewer IIf( ) functions and possibly reduce the number of other function calls.
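For instance (purely illustrative field names, since we haven't seen your full schema), a query targeting just the records that skip the C-D branch might use criteria like:

```sql
SELECT SpanID, Step02EndDate
FROM Span
WHERE Step03StartDate IS NULL        -- the C-D branch was skipped
  AND Step02StartDate IS NOT NULL;
```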
 
LVL 20

Author Comment

by:simon3270
ID: 39210774
@fyed: I think the number of queries would need to be quite large, and it would be hard to guarantee that records were all processed, but only processed once - the A and B ends of the circuit proceed independently and usually by different routes (one end might use A-C-D-E, the other A-E, or A-B-C-D-E, or both might be A-B-E, and the ends will almost certainly be at different stages on that path).  I'm looking first at VBA for the entire record, and at joins, as improvements (I had in fact started on the VBA option a few weeks ago, but the query was simpler then, so the performance improvement at the time was minimal).
 
LVL 49

Expert Comment

by:Dale Fye
ID: 39211021
Doesn't matter how many queries it takes, but ensuring that every record is processed can be an issue.  Using a VBA procedure to process records one at a time is rarely more efficient than running queries, but is sometimes necessary.

Personally, I think your table structure is flawed - 20+ different process steps with separate in and out times begs for a structure like:

ID
Process_type (this would define the steps in the particular process)
Process_num
Step
dtIn
dtOut
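In Access DDL, that might look something like this (a sketch - field names as above, types are my assumptions):

```sql
CREATE TABLE ProcessStep (
    ID            AUTOINCREMENT PRIMARY KEY,
    Process_type  TEXT(50),    -- defines the steps in the particular process
    Process_num   LONG,
    Step          TEXT(20),
    dtIn          DATETIME,
    dtOut         DATETIME
);
```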
 
LVL 20

Author Comment

by:simon3270
ID: 39211105
Yes, I think it is flawed, but we are where we are!

Having a separate Steps table might be doable (just have to make time to rewrite everything), but a couple more requirements came up this afternoon which definitely drive me towards one big VBA routine (they want another couple of steps added which have very different delay mechanisms and impacts - it's just getting too complicated to manage a query).
 
LVL 49

Expert Comment

by:Dale Fye
ID: 39211393
"... they want another couple of steps added ..."

That is the main reason for going with the structure I mentioned above.  You never know when someone is going to want more steps in the process (more columns), and more columns means re-coding, rewriting queries, ...

With the structure I recommended, and probably another table that describes which steps belong in which processes, you can avoid most of the coding and query re-writes.

Good luck.
