?
Solved

Using query value in custom function arguments causes function to be called for every row rather than every group - VERY SLOW

Posted on 2003-03-11
13
Medium Priority
?
300 Views
Last Modified: 2012-06-21
I have written a function that calculates the median based on a grouping level for use in an Access query.  I have over-explained the logic I used in case there is a completely better way to do this then the way I have devised.

The way I call the function from the query causes it to run for every row
rather than every group which slows down performance to a crawl.
Essentially, for 3 groups I am running 65 queries rather than 3.

Here is an example of the table (“sheet1”) used as the base of the query.  Quotes and commas are used to differentiate the fields here but are not present in the actual dataset.

ID, Foundation, Recipient, Amount
79, “Citigroup Foundation, NY”, “American School for the Deaf, CT”,  $15,000.00
80, “Citigroup Foundation, NY”, “Asylum Hill Congregational Church, CT”, $50,000.00
81, “Citigroup Foundation, NY”,  “Ballet School of Stamford, CT”, $15,000.00
82, “Citigroup Foundation, NY”, “Boys and Girls Clubs of Hartford, CT”, $10,000.00
83, “Citigroup Foundation, NY”,  “Bridge Family Center, CT”, $16,500.00
continues to end of table with many foundations present.

Here is the SQL statement I use for the query (sheet1!Foundation is a value
in the query that is used to create the SQL statement to get the group of
records):

SELECT Sheet1.Foundation,
GetMedian("Sheet1","Amount","Foundation",sheet1!Foundation) AS Median
FROM Sheet1

The current query takes the foundation field as the grouping field and the
median function requires that I supply it with the table name (sheet1),
calculation field (amount), grouping field (foundation), and the current
value in of the grouping field (sheet1.foundation).

This query produces the median amount for every foundation in the table:

Foundation, Median
”Citigroup Foundation, NY”, $20,000.00
”Dibner Fund, Inc., The, CT”, $25,000.00

Unfortunately, including a value from the query itself appears to cause the function to be called for every row in the table which, making it very timely to use for even a small recordset.

Is there a way to have the query call the function once for each grouping level rather than for member in the group?  When I run a query without a grouping factor (and therefore do not include a value from the query), the code is run once leading me to believe that this is difference in the function call.

I've included the function code for reference.  The top half of the if statement is for calculating the median when no grouping factor is involved, while the second half handles a grouping factor.  The only meaningful difference is in the function code is the SQL statement includes a WHERE statement

Function GetMedian(tblName As String, tblValue As String, _
Optional tblGrp As String, Optional tblGrpVal As String) As Currency

Dim varData As Variant
Dim introws As Integer
Dim curMedian As Double
Dim intDim As Integer
Dim varMed() As Double
Dim strSQL As String
'Dim strtblGrpVal As String
Dim objXL As Excel.Application


Dim rst As ADODB.Recordset
Set rst = New ADODB.Recordset

rst.CursorLocation = adUseClient

If tblGrp = "" Then 'No group

    strSQL = "Select " & tblName & "." & tblValue & " from " & tblName & ""
    rst.Open strSQL, CurrentProject.Connection, adOpenKeyset,
adLockOptimistic, adCmdText

    introws = rst.RecordCount
    varData = rst.GetRows(introws, Fields:=Array(tblValue))

    intDim = introws - 1
    'Debug.Print varData(0, 3)

    Set objXL = CreateObject("Excel.Application")
    ReDim varMed(0 To intDim) As Double
        For k = 0 To introws - 1
            varMed(k) = varData(0, k)
        Next

    curMedian = Excel.WorksheetFunction.Median(varMed())

    GetMedian = curMedian

Else 'Median value based on group

    strSQL = "Select " & tblName & "." & tblValue & " from " & tblName & "
WHERE (((" & tblName & "." & tblGrp & ") = " & """" & tblGrpVal & """))"
    rst.Open strSQL, CurrentProject.Connection, adOpenKeyset,
adLockOptimistic, adCmdText

    introws = rst.RecordCount
    varData = rst.GetRows(introws, Fields:=Array(tblValue))

    intDim = introws - 1
    'Debug.Print varData(0, 3)

    Set objXL = CreateObject("Excel.Application")
    ReDim varMed(0 To intDim) As Double
        For k = 0 To introws - 1
            varMed(k) = varData(0, k)
        Next

    curMedian = Excel.WorksheetFunction.Median(varMed())

    GetMedian = curMedian

Stopping:
End If

End Function
0
Comment
Question by:pansophy
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 8
  • 5
13 Comments
 
LVL 8

Expert Comment

by:dovholuk
ID: 8116903
well, if you're using access 2000 or later you can do all this in one query and it'll be might fast... (i think lol)

try this...

given data such as this (in a table name called Table3):

ID     Foundation     Recipient     Amount
1     Citigroup Foundation, NY     deaf     10000
2     Citigroup Foundation, NY     asylum     15000
3     Citigroup Foundation, NY     American School for the Deaf, CT     60000
4     Citigroup Foundation, NY     recip 4     50000
5     Citigroup Foundation, NY     recip 5     41000
6     Citigroup Foundation, NY     recip 6     50000
7     Citigroup Foundation, NY     American School for the Deaf, CT     15111

this sql in ONE query will get you what you want if i understand you correctly...

SELECT sums.Foundation, sums.Recipient, SumOfAmount, AmountCount, SumOfAmount / AmountCount AS Mean
FROM [select Foundation, Recipient, Sum (Amount) as SumOfAmount FROM Table3 GROUP BY Foundation, Recipient]. AS Sums INNER JOIN [SELECT Foundation, Recipient, Count(Amount) as AmountCount FROM table3 GROUP BY Foundation, Recipient]. AS Counts ON  Sums.Foundation = Counts.Foundation AND Sums.Recipient = Counts.Recipient;

if you change the table name (like Sheet1 or something) you need to change all the Table3 references above of couse...

let me know if you understand where i'm heading here...

give it a shot! :) Cheers,

dovholuk
0
 
LVL 8

Expert Comment

by:dovholuk
ID: 8116915
sorry that sql didn't post right... should read this:

SELECT sums.Foundation, sums.Recipient, SumOfAmount, AmountCount, SumOfAmount/AmountCount AS Mean
FROM (select Foundation, Recipient, Sum (Amount) as SumOfAmount FROM Table3 GROUP BY Foundation, Recipient) AS Sums INNER JOIN (SELECT Foundation, Recipient, Count(Amount) as AmountCount FROM table3 GROUP BY Foundation, Recipient) AS Counts ON (Sums.Foundation=Counts.Foundation) AND (Sums.Recipient=Counts.Recipient);


there were brackets [] in the first post that should have been parentheses ()... don't know HOW that happened.... but based off the data i provided above, the results would look like this:

Foundation     Recipient     SumOfAmount     AmountCount     Mean
Citigroup Foundation, NY     American School for the Deaf, CT     75111     2     37555.5
Citigroup Foundation, NY     asylum     15000     1     15000
Citigroup Foundation, NY     deaf     10000     1     10000
Citigroup Foundation, NY     recip 4     50000     1     50000
Citigroup Foundation, NY     recip 5     41000     1     41000
Citigroup Foundation, NY     recip 6     50000     1     50000

i THINK that's what you're after?
0
 

Author Comment

by:pansophy
ID: 8116963
dovholuk>  I see what you did, but I'm actually after something different.  The most important difference is that I am after the median value and not the mean.  That is the reason I have to write a function at all.

The function is designed to collect all the values from the group into an array and pass that to the Excel median function.  

A trival thing is that I want to group on Foundation, not the recipient (at least at this point).  The example rows only shows values from "Citigroup Foundation, NY" but generally in actuality there are many other foundations included.  That's my fault for giving a bad example, but the important grouping variable is foudnation, not recipient.  

Thanks.



0
What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

 
LVL 8

Expert Comment

by:dovholuk
ID: 8117042
mean, median, it's all the same... ok, maybe not... ;) so i got mean and median confused, it doesn't mean i can't help does it??? :)

since every row will try to calculate the median, how about this.

modify your method GetMedian to include 5 static values like this:

'**************** BEGIN CODE ****************
Function GetMedian( _
    tblName As String, _
    tblValue As String, _
    Optional tblGrp As String, _
    Optional tblGrpVal As String) As Currency

Static oldtblname As String
Static oldtblval As String
Static oldtblgrp As String
Static oldtblgrpval As String
Static oldresult As Currency

If (tblName <> oldtblname) And _
    (tblValue <> oldtblval) And _
    (tblGrp <> oldtblgrp) And _
    (tblGrpVal <> oldtblgrpval) Then
    GetMedian = oldresult
    Exit Function
End If

End If
'**************** END CODE ****************

then at the bottom of your method, you need to include these two lines:
   oldresult = curMedian '<-- i added this line to set the static value...
   GetMedian = curMedian

now what will this do? it will basically "cache" the results you got from the method the last time you called it... so if you call the same tblname, tblval time after time, it'll NOT do the database lookups and just return the value that you last calculated... make sense?

if not lemme know, i'll be up for a bit... (i'm on EST which is GMT- 5hrs so it's 12:18 am here)

dovholuk
0
 
LVL 8

Expert Comment

by:dovholuk
ID: 8117050
i've got some other optimizer tips for you if that doesn't speed things up dramatically but i would HAVE to believe that this will speed it up dramatically!

such as using GetRows and not a vba loop to get all the values into an array, passing EVERY  etc. etc.

again, lemme know...

dovholuk
0
 

Author Comment

by:pansophy
ID: 8117114
That certainly did something as it only ran very quickly.  The only problem is that I get 0.00 as an answer for each group.  

Here is how I inserted your code.  Do I need to do something extra for the tblname, tblgrp etc to be stored in the oldtblname, oldtblgrp for it to go to "cache"

Oh, by the way, I use GetRows to get the data from the table, but that puts the rows into a two-dimensional array.  The excel function will only accept it in a one-dimensional array, so barring a Redim that can preserve the data, I have to transfer it.  

Here is the code in case I made a mistake:

Function GetMedian(tblName As String, tblValue As String, _
Optional tblGrp As String, Optional tblGrpVal As String) As Currency

Dim varData As Variant
Dim introws As Integer
Dim curMedian As Double
Dim intDim As Integer
Dim varMed() As Double
Dim strSQL As String
'Dim strtblGrpVal As String
Dim objXL As Excel.Application


Static oldtblname As String
Static oldtblval As String
Static oldtblgrp As String
Static oldtblgrpval As String
Static oldresult As Currency

If (tblName <> oldtblname) And _
   (tblValue <> oldtblval) And _
   (tblGrp <> oldtblgrp) And _
   (tblGrpVal <> oldtblgrpval) Then
   GetMedian = oldresult
   Exit Function
End If



Dim rst As ADODB.Recordset
Set rst = New ADODB.Recordset

rst.CursorLocation = adUseClient

If tblGrp = "" Then 'No group
   
    strSQL = "Select " & tblName & "." & tblValue & " from " & tblName & ""
    rst.Open strSQL, CurrentProject.Connection, adOpenKeyset, adLockOptimistic, adCmdText

    introws = rst.RecordCount
    varData = rst.GetRows(introws, Fields:=Array(tblValue))
   
    intDim = introws - 1
    'Debug.Print varData(0, 3)
     
    Set objXL = CreateObject("Excel.Application")
    ReDim varMed(0 To intDim) As Double
        For k = 0 To introws - 1
            varMed(k) = varData(0, k)
        Next
   
    curMedian = Excel.WorksheetFunction.Median(varMed())
   
    GetMedian = curMedian

Else 'Median value based on group

    strSQL = "Select " & tblName & "." & tblValue & " from " & tblName & " WHERE (((" & tblName & "." & tblGrp & ") = " & """" & tblGrpVal & """))"
    rst.Open strSQL, CurrentProject.Connection, adOpenKeyset, adLockOptimistic, adCmdText
   
    introws = rst.RecordCount
    varData = rst.GetRows(introws, Fields:=Array(tblValue))
   
    intDim = introws - 1
    'Debug.Print varData(0, 3)
     
    Set objXL = CreateObject("Excel.Application")
    ReDim varMed(0 To intDim) As Double
        For k = 0 To introws - 1
            varMed(k) = varData(0, k)
        Next
   
    curMedian = Excel.WorksheetFunction.Median(varMed())
   
    oldresult = curMedian
    GetMedian = curMedian
   
Stopping:
End If

End Function
0
 
LVL 8

Accepted Solution

by:
dovholuk earned 1000 total points
ID: 8117166
some things i messed up... (sorry)

the if statement at the top should test EQUALITY not INEQUALITY... that's ONE reason you're getting 0.00...

next, you have a if then else condition i didn't notice that would also result in 0.00 if the condition was true, so i moved these lines (which were duplicated for "no good reason" (that i could see) outside the if...then..else  condition

   curMedian = Excel.WorksheetFunction.Median(varMed())
   GetMedian = curMedian
   oldresult = curMedian

finally, i notice that i wasn't setting the static variables, i added that too... copy / paste this method and you should be good to go!

if you can get back to me in ~10-15 minutes, i'll still be up... after that and it's night nights for me.... :)

-dovholuk


'******************* BEGIN COPY *******************

Function GetMedian(tblName As String, tblValue As String, _
Optional tblGrp As String, Optional tblGrpVal As String) As Currency

Dim varData As Variant
Dim introws As Integer
Dim curMedian As Double
Dim intDim As Integer
Dim varMed() As Double
Dim strSQL As String
'Dim strtblGrpVal As String
Dim objXL As Excel.Application


Static oldtblname As String
Static oldtblval As String
Static oldtblgrp As String
Static oldtblgrpval As String
Static oldresult As Currency

If (tblName = oldtblname) And _
  (tblValue = oldtblval) And _
  (tblGrp = oldtblgrp) And _
  (tblGrpVal = oldtblgrpval) Then
  GetMedian = oldresult
  Exit Function
End If


oldtblname = tblName
oldtblval = tblValue
oldtblgrp = tblGrp
oldtblgrpval = tblGrpVal

Dim rst As ADODB.Recordset
Set rst = New ADODB.Recordset

rst.CursorLocation = adUseClient

If tblGrp = "" Then 'No group
   
   strSQL = "Select " & tblName & "." & tblValue & " from " & tblName & ""
   rst.Open strSQL, CurrentProject.Connection, adOpenKeyset, adLockOptimistic, adCmdText

   introws = rst.RecordCount
   varData = rst.GetRows(introws, Fields:=Array(tblValue))
 
   intDim = introws - 1
   'Debug.Print varData(0, 3)
   
   Set objXL = CreateObject("Excel.Application")
   ReDim varMed(0 To intDim) As Double
       For k = 0 To introws - 1
           varMed(k) = varData(0, k)
       Next

Else 'Median value based on group

   strSQL = "Select " & tblName & "." & tblValue & " from " & tblName & " WHERE (((" & tblName & "." & tblGrp & ") = " & """" & tblGrpVal & """))"
   rst.Open strSQL, CurrentProject.Connection, adOpenKeyset, adLockOptimistic, adCmdText
   
   introws = rst.RecordCount
   varData = rst.GetRows(introws, Fields:=Array(tblValue))
 
   intDim = introws - 1
   'Debug.Print varData(0, 3)
   
   Set objXL = CreateObject("Excel.Application")
   ReDim varMed(0 To intDim) As Double
       For k = 0 To introws - 1
           varMed(k) = varData(0, k)
       Next
   
Stopping:
End If

   curMedian = Excel.WorksheetFunction.Median(varMed())
   GetMedian = curMedian
   oldresult = curMedian

End Function

'******************* END COPY *******************
0
 
LVL 8

Expert Comment

by:dovholuk
ID: 8117195
some more thoughts... if you order the result set so that the fieldnames (tblValue in the method) are all sorted ascending or descending, the "cache" will work much better...

for instance, if you have these rows...

item1
item1
item1
item2
item2
item2
item3
item3
item3

it would only hit the table three times.  now if it's NOT sorted, and you have a result set like this:

item1
item2
item3
item1
item2
item3
item1
item2
item3

it'll hit the database 9 times, and not three because the variables are changing each time... make sense?

so it'd behoove you to sort the query that you use, i'm sure you get my gist...

dovholuk
0
 

Author Comment

by:pansophy
ID: 8117243
This worked great. Thank you.  

I'm completely following you on the sorting issue.  Do you mean I should make sure the table is sorted before running the query or the query itself should be have the grouping variable sorted?

Just out of curiosity, why do the built in functions like mean and stdev work so much quicker?  For my actual table I have 152 foundations, so even with these changes the function hits the table 152 times.  Certainly much better than before though given I had 556 rows.    



0
 
LVL 8

Expert Comment

by:dovholuk
ID: 8117277
you caught me *just* before bed... i'll try to answer this tomorrow, okie dokie? i'm tired...

dovholuk
0
 

Author Comment

by:pansophy
ID: 8120038
One other thing.  If I copy the code for use with another function I get a compile error when I run the query. I tried changing the names of all the variables (including the static ones), but can only get the code to work if I delete the copied code.

I haven't worked with static values before, but do I need to do something special if I want to use this code as the basis of other functions?

0
 
LVL 8

Expert Comment

by:dovholuk
ID: 8124862
>If I copy the code for use with another function I get a compile error when I run the query

i'd need to see what you copied and how. it seems to me that the most obvious error (if you copied and pasted) is that you didn't change the name of the function, are BOTH functions named GetMedian? if so, then you need to either scope them properly by putting them into separate modules and making them PRIVATE and not PUBLIC, or you need to give them both unique names.  that's the only thing i can think of off the top of my head that would cause a compile error.  if that's not the case, post the code here and i'll take a looksie...

static variables are no different than any other variable really.  they only differ in one respect.  if you declare a variable static, when the function/sub (called a method) finishes, the value will be retained if that same method is called again.

take these two methods for example:

Public Function UseStaticVariables(Val As String) As String

     Static MyStaticVar As String

If Val <> "" Then MyStaticVar = Val

UseStaticVariables = "Static Variable " & MyStaticVar
End Function

Public Function DoNotUseStaticVariables(Val As String) As String

     Dim MyStaticVar As String

If Val <> "" Then MyStaticVar = Val

DoNotUseStaticVariables = "No Static Variable " & MyStaticVar
End Function

now use test them by going to the immediate/debug window and typing and pushing enter after each line:

?DoNotUseStaticVariables("test") 'line 1
?UseStaticVariables("test") 'line 2
?DoNotUseStaticVariables("") 'line 3
?UseStaticVariables("") 'line 4

you should get this output:
No Static Variable test '<- line 1 outupt
Static Variable test '<- line 2 outupt
No Static Variable '<- line 3 outupt
Static Variable test '<- line 4 outupt

so you see, the second time you call UseStaticVariables, even though you didn't give it a value to output, it already HAS a value to output ('test' in this case) because the value is static.

if there's one thing i'd tell you, it's this... global public variables are the work of the devil and are considered (by me) a crutch that shouldn't be employed.  global public variables are trouble just waiting to happen.  however, a global public variable would do the same job as a static variable would.  i just perfer to not use globals anywhere i can get away with it...

so that's the long and the short of it. sorry for taking so long to get back to you... i was, er AM sick today...

dovholuk
0
 

Author Comment

by:pansophy
ID: 8124887
thanks a lot dovholuk for the explanation.  I've never had to use static variables, but now that I know about them I'm sure I find ways to make use of them :)

As for the compile error, I think I wasn't very careful about changing ALL the variable names in the functions because when I was systematic with it, I was fine.  I'll experiment with the public versus private declaration as well.

Thanks again...very helpful.
0

Featured Post

Free Tool: Subnet Calculator

The subnet calculator helps you design networks by taking an IP address and network mask and returning information such as network, broadcast address, and host range.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In earlier versions of Windows (XP and before), you could drag a database to the taskbar, where it would appear as a taskbar icon to open that database.  This article shows how to recreate this functionality in Windows 7 through 10.
This article describes a method of delivering Word templates for use in merging Access data to Word documents, that requires no computer knowledge on the part of the recipient -- the templates are saved in table fields, and are extracted and install…
What’s inside an Access Desktop Database. Will look at the basic interface, Navigation Pane (Database Container), Tables, Queries, Forms, Report, Macro’s, and VBA code.
Add bar graphs to Access queries using Unicode block characters. Graphs appear on every record in the color you want. Give life to numbers. Hopes this gives you ideas on visualizing your data in new ways ~ Create a calculated field in a query: …
Suggested Courses

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question