Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 302
  • Last Modified:

Using query value in custom function arguments causes function to be called for every row rather than every group - VERY SLOW

I have written a function that calculates the median based on a grouping level for use in an Access query.  I have over-explained the logic I used in case there is a completely better way to do this then the way I have devised.

The way I call the function from the query causes it to run for every row
rather than every group which slows down performance to a crawl.
Essentially, for 3 groups I am running 65 queries rather than 3.

Here is an example of the table (“sheet1”) used as the base of the query.  Quotes and commas are used to differentiate the fields here but are not present in the actual dataset.

ID, Foundation, Recipient, Amount
79, “Citigroup Foundation, NY”, “American School for the Deaf, CT”,  $15,000.00
80, “Citigroup Foundation, NY”, “Asylum Hill Congregational Church, CT”, $50,000.00
81, “Citigroup Foundation, NY”,  “Ballet School of Stamford, CT”, $15,000.00
82, “Citigroup Foundation, NY”, “Boys and Girls Clubs of Hartford, CT”, $10,000.00
83, “Citigroup Foundation, NY”,  “Bridge Family Center, CT”, $16,500.00
continues to end of table with many foundations present.

Here is the SQL statement I use for the query (sheet1!Foundation is a value
in the query that is used to create the SQL statement to get the group of
records):

SELECT Sheet1.Foundation,
GetMedian("Sheet1","Amount","Foundation",sheet1!Foundation) AS Median
FROM Sheet1

The current query takes the foundation field as the grouping field and the
median function requires that I supply it with the table name (sheet1),
calculation field (amount), grouping field (foundation), and the current
value in of the grouping field (sheet1.foundation).

This query produces the median amount for every foundation in the table:

Foundation, Median
”Citigroup Foundation, NY”, $20,000.00
”Dibner Fund, Inc., The, CT”, $25,000.00

Unfortunately, including a value from the query itself appears to cause the function to be called for every row in the table which, making it very timely to use for even a small recordset.

Is there a way to have the query call the function once for each grouping level rather than for member in the group?  When I run a query without a grouping factor (and therefore do not include a value from the query), the code is run once leading me to believe that this is difference in the function call.

I've included the function code for reference.  The top half of the if statement is for calculating the median when no grouping factor is involved, while the second half handles a grouping factor.  The only meaningful difference is in the function code is the SQL statement includes a WHERE statement

Function GetMedian(tblName As String, tblValue As String, _
Optional tblGrp As String, Optional tblGrpVal As String) As Currency

Dim varData As Variant
Dim introws As Integer
Dim curMedian As Double
Dim intDim As Integer
Dim varMed() As Double
Dim strSQL As String
'Dim strtblGrpVal As String
Dim objXL As Excel.Application


Dim rst As ADODB.Recordset
Set rst = New ADODB.Recordset

rst.CursorLocation = adUseClient

If tblGrp = "" Then 'No group

    strSQL = "Select " & tblName & "." & tblValue & " from " & tblName & ""
    rst.Open strSQL, CurrentProject.Connection, adOpenKeyset,
adLockOptimistic, adCmdText

    introws = rst.RecordCount
    varData = rst.GetRows(introws, Fields:=Array(tblValue))

    intDim = introws - 1
    'Debug.Print varData(0, 3)

    Set objXL = CreateObject("Excel.Application")
    ReDim varMed(0 To intDim) As Double
        For k = 0 To introws - 1
            varMed(k) = varData(0, k)
        Next

    curMedian = Excel.WorksheetFunction.Median(varMed())

    GetMedian = curMedian

Else 'Median value based on group

    strSQL = "Select " & tblName & "." & tblValue & " from " & tblName & "
WHERE (((" & tblName & "." & tblGrp & ") = " & """" & tblGrpVal & """))"
    rst.Open strSQL, CurrentProject.Connection, adOpenKeyset,
adLockOptimistic, adCmdText

    introws = rst.RecordCount
    varData = rst.GetRows(introws, Fields:=Array(tblValue))

    intDim = introws - 1
    'Debug.Print varData(0, 3)

    Set objXL = CreateObject("Excel.Application")
    ReDim varMed(0 To intDim) As Double
        For k = 0 To introws - 1
            varMed(k) = varData(0, k)
        Next

    curMedian = Excel.WorksheetFunction.Median(varMed())

    GetMedian = curMedian

Stopping:
End If

End Function
0
pansophy
Asked:
pansophy
  • 8
  • 5
1 Solution
 
dovholukCommented:
well, if you're using access 2000 or later you can do all this in one query and it'll be might fast... (i think lol)

try this...

given data such as this (in a table name called Table3):

ID     Foundation     Recipient     Amount
1     Citigroup Foundation, NY     deaf     10000
2     Citigroup Foundation, NY     asylum     15000
3     Citigroup Foundation, NY     American School for the Deaf, CT     60000
4     Citigroup Foundation, NY     recip 4     50000
5     Citigroup Foundation, NY     recip 5     41000
6     Citigroup Foundation, NY     recip 6     50000
7     Citigroup Foundation, NY     American School for the Deaf, CT     15111

this sql in ONE query will get you what you want if i understand you correctly...

SELECT sums.Foundation, sums.Recipient, SumOfAmount, AmountCount, SumOfAmount / AmountCount AS Mean
FROM [select Foundation, Recipient, Sum (Amount) as SumOfAmount FROM Table3 GROUP BY Foundation, Recipient]. AS Sums INNER JOIN [SELECT Foundation, Recipient, Count(Amount) as AmountCount FROM table3 GROUP BY Foundation, Recipient]. AS Counts ON  Sums.Foundation = Counts.Foundation AND Sums.Recipient = Counts.Recipient;

if you change the table name (like Sheet1 or something) you need to change all the Table3 references above of couse...

let me know if you understand where i'm heading here...

give it a shot! :) Cheers,

dovholuk
0
 
dovholukCommented:
sorry that sql didn't post right... should read this:

SELECT sums.Foundation, sums.Recipient, SumOfAmount, AmountCount, SumOfAmount/AmountCount AS Mean
FROM (select Foundation, Recipient, Sum (Amount) as SumOfAmount FROM Table3 GROUP BY Foundation, Recipient) AS Sums INNER JOIN (SELECT Foundation, Recipient, Count(Amount) as AmountCount FROM table3 GROUP BY Foundation, Recipient) AS Counts ON (Sums.Foundation=Counts.Foundation) AND (Sums.Recipient=Counts.Recipient);


there were brackets [] in the first post that should have been parentheses ()... don't know HOW that happened.... but based off the data i provided above, the results would look like this:

Foundation     Recipient     SumOfAmount     AmountCount     Mean
Citigroup Foundation, NY     American School for the Deaf, CT     75111     2     37555.5
Citigroup Foundation, NY     asylum     15000     1     15000
Citigroup Foundation, NY     deaf     10000     1     10000
Citigroup Foundation, NY     recip 4     50000     1     50000
Citigroup Foundation, NY     recip 5     41000     1     41000
Citigroup Foundation, NY     recip 6     50000     1     50000

i THINK that's what you're after?
0
 
pansophyAuthor Commented:
dovholuk>  I see what you did, but I'm actually after something different.  The most important difference is that I am after the median value and not the mean.  That is the reason I have to write a function at all.

The function is designed to collect all the values from the group into an array and pass that to the Excel median function.  

A trival thing is that I want to group on Foundation, not the recipient (at least at this point).  The example rows only shows values from "Citigroup Foundation, NY" but generally in actuality there are many other foundations included.  That's my fault for giving a bad example, but the important grouping variable is foudnation, not recipient.  

Thanks.



0
Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
dovholukCommented:
mean, median, it's all the same... ok, maybe not... ;) so i got mean and median confused, it doesn't mean i can't help does it??? :)

since every row will try to calculate the median, how about this.

modify your method GetMedian to include 5 static values like this:

'**************** BEGIN CODE ****************
Function GetMedian( _
    tblName As String, _
    tblValue As String, _
    Optional tblGrp As String, _
    Optional tblGrpVal As String) As Currency

Static oldtblname As String
Static oldtblval As String
Static oldtblgrp As String
Static oldtblgrpval As String
Static oldresult As Currency

If (tblName <> oldtblname) And _
    (tblValue <> oldtblval) And _
    (tblGrp <> oldtblgrp) And _
    (tblGrpVal <> oldtblgrpval) Then
    GetMedian = oldresult
    Exit Function
End If

End If
'**************** END CODE ****************

then at the bottom of your method, you need to include these two lines:
   oldresult = curMedian '<-- i added this line to set the static value...
   GetMedian = curMedian

now what will this do? it will basically "cache" the results you got from the method the last time you called it... so if you call the same tblname, tblval time after time, it'll NOT do the database lookups and just return the value that you last calculated... make sense?

if not lemme know, i'll be up for a bit... (i'm on EST which is GMT- 5hrs so it's 12:18 am here)

dovholuk
0
 
dovholukCommented:
i've got some other optimizer tips for you if that doesn't speed things up dramatically but i would HAVE to believe that this will speed it up dramatically!

such as using GetRows and not a vba loop to get all the values into an array, passing EVERY  etc. etc.

again, lemme know...

dovholuk
0
 
pansophyAuthor Commented:
That certainly did something as it only ran very quickly.  The only problem is that I get 0.00 as an answer for each group.  

Here is how I inserted your code.  Do I need to do something extra for the tblname, tblgrp etc to be stored in the oldtblname, oldtblgrp for it to go to "cache"

Oh, by the way, I use GetRows to get the data from the table, but that puts the rows into a two-dimensional array.  The excel function will only accept it in a one-dimensional array, so barring a Redim that can preserve the data, I have to transfer it.  

Here is the code in case I made a mistake:

Function GetMedian(tblName As String, tblValue As String, _
Optional tblGrp As String, Optional tblGrpVal As String) As Currency

Dim varData As Variant
Dim introws As Integer
Dim curMedian As Double
Dim intDim As Integer
Dim varMed() As Double
Dim strSQL As String
'Dim strtblGrpVal As String
Dim objXL As Excel.Application


Static oldtblname As String
Static oldtblval As String
Static oldtblgrp As String
Static oldtblgrpval As String
Static oldresult As Currency

If (tblName <> oldtblname) And _
   (tblValue <> oldtblval) And _
   (tblGrp <> oldtblgrp) And _
   (tblGrpVal <> oldtblgrpval) Then
   GetMedian = oldresult
   Exit Function
End If



Dim rst As ADODB.Recordset
Set rst = New ADODB.Recordset

rst.CursorLocation = adUseClient

If tblGrp = "" Then 'No group
   
    strSQL = "Select " & tblName & "." & tblValue & " from " & tblName & ""
    rst.Open strSQL, CurrentProject.Connection, adOpenKeyset, adLockOptimistic, adCmdText

    introws = rst.RecordCount
    varData = rst.GetRows(introws, Fields:=Array(tblValue))
   
    intDim = introws - 1
    'Debug.Print varData(0, 3)
     
    Set objXL = CreateObject("Excel.Application")
    ReDim varMed(0 To intDim) As Double
        For k = 0 To introws - 1
            varMed(k) = varData(0, k)
        Next
   
    curMedian = Excel.WorksheetFunction.Median(varMed())
   
    GetMedian = curMedian

Else 'Median value based on group

    strSQL = "Select " & tblName & "." & tblValue & " from " & tblName & " WHERE (((" & tblName & "." & tblGrp & ") = " & """" & tblGrpVal & """))"
    rst.Open strSQL, CurrentProject.Connection, adOpenKeyset, adLockOptimistic, adCmdText
   
    introws = rst.RecordCount
    varData = rst.GetRows(introws, Fields:=Array(tblValue))
   
    intDim = introws - 1
    'Debug.Print varData(0, 3)
     
    Set objXL = CreateObject("Excel.Application")
    ReDim varMed(0 To intDim) As Double
        For k = 0 To introws - 1
            varMed(k) = varData(0, k)
        Next
   
    curMedian = Excel.WorksheetFunction.Median(varMed())
   
    oldresult = curMedian
    GetMedian = curMedian
   
Stopping:
End If

End Function
0
 
dovholukCommented:
some things i messed up... (sorry)

the if statement at the top should test EQUALITY not INEQUALITY... that's ONE reason you're getting 0.00...

next, you have a if then else condition i didn't notice that would also result in 0.00 if the condition was true, so i moved these lines (which were duplicated for "no good reason" (that i could see) outside the if...then..else  condition

   curMedian = Excel.WorksheetFunction.Median(varMed())
   GetMedian = curMedian
   oldresult = curMedian

finally, i notice that i wasn't setting the static variables, i added that too... copy / paste this method and you should be good to go!

if you can get back to me in ~10-15 minutes, i'll still be up... after that and it's night nights for me.... :)

-dovholuk


'******************* BEGIN COPY *******************

Function GetMedian(tblName As String, tblValue As String, _
Optional tblGrp As String, Optional tblGrpVal As String) As Currency

Dim varData As Variant
Dim introws As Integer
Dim curMedian As Double
Dim intDim As Integer
Dim varMed() As Double
Dim strSQL As String
'Dim strtblGrpVal As String
Dim objXL As Excel.Application


Static oldtblname As String
Static oldtblval As String
Static oldtblgrp As String
Static oldtblgrpval As String
Static oldresult As Currency

If (tblName = oldtblname) And _
  (tblValue = oldtblval) And _
  (tblGrp = oldtblgrp) And _
  (tblGrpVal = oldtblgrpval) Then
  GetMedian = oldresult
  Exit Function
End If


oldtblname = tblName
oldtblval = tblValue
oldtblgrp = tblGrp
oldtblgrpval = tblGrpVal

Dim rst As ADODB.Recordset
Set rst = New ADODB.Recordset

rst.CursorLocation = adUseClient

If tblGrp = "" Then 'No group
   
   strSQL = "Select " & tblName & "." & tblValue & " from " & tblName & ""
   rst.Open strSQL, CurrentProject.Connection, adOpenKeyset, adLockOptimistic, adCmdText

   introws = rst.RecordCount
   varData = rst.GetRows(introws, Fields:=Array(tblValue))
 
   intDim = introws - 1
   'Debug.Print varData(0, 3)
   
   Set objXL = CreateObject("Excel.Application")
   ReDim varMed(0 To intDim) As Double
       For k = 0 To introws - 1
           varMed(k) = varData(0, k)
       Next

Else 'Median value based on group

   strSQL = "Select " & tblName & "." & tblValue & " from " & tblName & " WHERE (((" & tblName & "." & tblGrp & ") = " & """" & tblGrpVal & """))"
   rst.Open strSQL, CurrentProject.Connection, adOpenKeyset, adLockOptimistic, adCmdText
   
   introws = rst.RecordCount
   varData = rst.GetRows(introws, Fields:=Array(tblValue))
 
   intDim = introws - 1
   'Debug.Print varData(0, 3)
   
   Set objXL = CreateObject("Excel.Application")
   ReDim varMed(0 To intDim) As Double
       For k = 0 To introws - 1
           varMed(k) = varData(0, k)
       Next
   
Stopping:
End If

   curMedian = Excel.WorksheetFunction.Median(varMed())
   GetMedian = curMedian
   oldresult = curMedian

End Function

'******************* END COPY *******************
0
 
dovholukCommented:
some more thoughts... if you order the result set so that the fieldnames (tblValue in the method) are all sorted ascending or descending, the "cache" will work much better...

for instance, if you have these rows...

item1
item1
item1
item2
item2
item2
item3
item3
item3

it would only hit the table three times.  now if it's NOT sorted, and you have a result set like this:

item1
item2
item3
item1
item2
item3
item1
item2
item3

it'll hit the database 9 times, and not three because the variables are changing each time... make sense?

so it'd behoove you to sort the query that you use, i'm sure you get my gist...

dovholuk
0
 
pansophyAuthor Commented:
This worked great. Thank you.  

I'm completely following you on the sorting issue.  Do you mean I should make sure the table is sorted before running the query or the query itself should be have the grouping variable sorted?

Just out of curiosity, why do the built in functions like mean and stdev work so much quicker?  For my actual table I have 152 foundations, so even with these changes the function hits the table 152 times.  Certainly much better than before though given I had 556 rows.    



0
 
dovholukCommented:
you caught me *just* before bed... i'll try to answer this tomorrow, okie dokie? i'm tired...

dovholuk
0
 
pansophyAuthor Commented:
One other thing.  If I copy the code for use with another function I get a compile error when I run the query. I tried changing the names of all the variables (including the static ones), but can only get the code to work if I delete the copied code.

I haven't worked with static values before, but do I need to do something special if I want to use this code as the basis of other functions?

0
 
dovholukCommented:
>If I copy the code for use with another function I get a compile error when I run the query

i'd need to see what you copied and how. it seems to me that the most obvious error (if you copied and pasted) is that you didn't change the name of the function, are BOTH functions named GetMedian? if so, then you need to either scope them properly by putting them into separate modules and making them PRIVATE and not PUBLIC, or you need to give them both unique names.  that's the only thing i can think of off the top of my head that would cause a compile error.  if that's not the case, post the code here and i'll take a looksie...

static variables are no different than any other variable really.  they only differ in one respect.  if you declare a variable static, when the function/sub (called a method) finishes, the value will be retained if that same method is called again.

take these two methods for example:

Public Function UseStaticVariables(Val As String) As String

     Static MyStaticVar As String

If Val <> "" Then MyStaticVar = Val

UseStaticVariables = "Static Variable " & MyStaticVar
End Function

Public Function DoNotUseStaticVariables(Val As String) As String

     Dim MyStaticVar As String

If Val <> "" Then MyStaticVar = Val

DoNotUseStaticVariables = "No Static Variable " & MyStaticVar
End Function

now use test them by going to the immediate/debug window and typing and pushing enter after each line:

?DoNotUseStaticVariables("test") 'line 1
?UseStaticVariables("test") 'line 2
?DoNotUseStaticVariables("") 'line 3
?UseStaticVariables("") 'line 4

you should get this output:
No Static Variable test '<- line 1 outupt
Static Variable test '<- line 2 outupt
No Static Variable '<- line 3 outupt
Static Variable test '<- line 4 outupt

so you see, the second time you call UseStaticVariables, even though you didn't give it a value to output, it already HAS a value to output ('test' in this case) because the value is static.

if there's one thing i'd tell you, it's this... global public variables are the work of the devil and are considered (by me) a crutch that shouldn't be employed.  global public variables are trouble just waiting to happen.  however, a global public variable would do the same job as a static variable would.  i just perfer to not use globals anywhere i can get away with it...

so that's the long and the short of it. sorry for taking so long to get back to you... i was, er AM sick today...

dovholuk
0
 
pansophyAuthor Commented:
thanks a lot dovholuk for the explanation.  I've never had to use static variables, but now that I know about them I'm sure I find ways to make use of them :)

As for the compile error, I think I wasn't very careful about changing ALL the variable names in the functions because when I was systematic with it, I was fine.  I'll experiment with the public versus private declaration as well.

Thanks again...very helpful.
0

Featured Post

Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

  • 8
  • 5
Tackle projects and never again get stuck behind a technical roadblock.
Join Now