Solved

How to improve custom made function VLOOKUP2

Posted on 2013-01-25
8
671 Views
Last Modified: 2013-01-27
Dear members of Expert Exchange!

I want to introduce custom made function VLOOKUP2. The source is http://www.planetaexcel.ru/techniques/2/100/

This function is powerful. You may:
- search in any column of range (in data base);
- define index number in search result.

Look at attached XLS-file for examples.

But there is a room for improvement (original function have slow algorithm)

Function VLOOKUP2(Table As Variant, SearchColumnNum As Long, SearchValue As Variant, _
                  N As Long, ResultColumnNum As Long)
                  
    Dim i As Long, iCount As Long
    Select Case TypeName(Table)
    Case "Range"
        For i = 1 To Table.Rows.Count
            If Table.Cells(i, SearchColumnNum) = SearchValue Then
                iCount = iCount + 1
            End If
            If iCount = N Then
                VLOOKUP2 = Table.Cells(i, ResultColumnNum)
                Exit For
            End If
        Next i
    Case "Variant()"
        For i = 1 To UBound(Table)
            If Table(i, 1) = SearchValue Then iCount = iCount + 1
            If iCount = N Then
                VLOOKUP2 = Table(i, ResultColumnNum)
                Exit For
            End If
        Next i
    End Select
End Function

Open in new window


Tasks for optimization:
- change from loop to .find (but I not know .find syntax);
- optimize search algorithm if user want to retrieve only first value in records set;
- optimize search algorithm if user want to retrieve not first value in records set;
- optimize search algorithm if user want to retrieve more as one value in records set;
- add option for return count of records set.

Thanks in advance for help!

P.S. Please sorry for my English!
VLOOKUP2-eng.xls
0
Comment
Question by:Last_Free_Man
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
  • 2
  • +1
8 Comments
 
LVL 24

Assisted Solution

by:Steve
Steve earned 100 total points
ID: 38820428
You will find the attached file with VLOOKUP3

This uses Array method rather than cell processing (so will be far faster)
Test it, see if it does the job.

you would not use a .find type as it will be slower than just looping an array.
Setting up and running a find would be slower than looping an in memory array.

Function VLOOKUP3(Table As Range, SearchColumnNum As Long, SearchValue As Variant, N As Long, ResultColumnNum As Long)
    Dim i As Long, iCount As Long
    Dim DataArr
    DataArr = Table.Value

        For i = 1 To UBound(DataArr)
            If DataArr(i, SearchColumnNum) = SearchValue Then
                iCount = iCount + 1
            End If
            If iCount = N Then
                VLOOKUP3 = DataArr(i, ResultColumnNum)
                Exit For
            End If
        Next i
End Function

Open in new window


This will speed up the existing function, but not really change it.
I am sure it is possible to do each of the changes you require, but I would suggest a single step at a time witha  single question for each.

To cover all the steps in one hit is a bit more than a 300 point question :)

As for speed 'increase' over 10,000 itterations:

VLOOKUP3         VLOOKUP2      
0.703                1.281      
0.609                1.508      
0.625                1.328      
0.609                1.273      
0.609                1.297              Improvement
0.631                1.3374      0.7064

Test code:
Sub SpeedTest()
xx = Timer
For x = 1 To 10000
xxx = VLOOKUP2(ActiveSheet.Range("$A$1:$D$21"), 1, "Smith", 3, 4)
Next x
Debug.Print "Time Taken: " & Format(Timer - xx, "0.000") & "s"
End Sub

Open in new window

VLOOKUP2-eng.xls
0
 
LVL 26

Accepted Solution

by:
redmondb earned 100 total points
ID: 38821339
Folks,

The attached has four methods...
 - VLookup2 - the original method, searching cell by cell.
 - VLookup3 - The_Barman's array method.
 - VLookup4 - uses Find().
 - VLookup5 - uses Match().

I don't believe that any single method is the optimum for all conditions. With the exception of the Find() method, each of the method is the best under some circumstances...
 - The Cell method is best for a large search space, looking for a single value - which is found very early. Under these circumstances, the Array method is by a long way the worst. (Which makes sense - The Cell method has no overhead, but is expensive per row, whereas the Array method has a huge overhead, but once the array is loaded searching is effectively instant.)
 - The Array method is best when searching a small number of cells - certainly less than 50. (At the other extreme, when looking for multiple hits, it's somewhere after the 4,500th occurrence that the advantage swings back the Array method.)
 - Other than the above (special?) cases, the Match method is pretty much always the fastest method.
 - There are various circumstances under which the Find method can beat the Cell or Array methods. However, a fundamental problem with Find is that you have to make a choice between searching by Values or Formulas. Values seems like the obvious choice, but that means that 12497 won't find that number if its formatted (e.g.$12,497.00) whereas searching by Formulas will. However, it would seem to be an unacceptable constraint that you couldn't search cells containing formulas, so Find is out.
 - The Cell and Array methods are case-sensitve, whereas the Find and Match methods aren't. Potentially a show-stopper.

So...
 - For case-sensitive, the Array method is best.
 - For a small table, the Array method is best.
 - For huge extremes of occurrence (e.g. the 5,000th occurrence of a value), the Array method is also best.
- For everything else, Match is best. It has the further advantage that I never saw it take a long time - the worst time was looking for the 8,000th occurrence of a value in 60,000 row which took a tenth of a second. At its best, it was 50 or more times faster than the Array method.

Before I started this, I assumed that the Array method would be the optimum approach, so Match's results were a big surprise to me. A word of warning though, the Array method is a common and well-tested method, whereas I've never seen the Match method used before.

Finally, points are officially a measure of the importance of the question to you, not the work content, so the 300 isn't an issue. However, I agree with The_Barman that you really have multiple questions here. I suggest that you treat this question as a performance one.

Once you've got the performance issues clear in your own mind (and unless performance is hugely significant, I'd go for the Array method) then open a second question for all the other functionality as a single function - it's easy to use optional parameters to include or exclude functionality for individual runs.

Regards,
Brian.VLOOKUP2-eng-V3.xlsm

Regards,
Brian.
0
 
LVL 24

Expert Comment

by:Steve
ID: 38821644
Interesting stuff there Brian,

I hadn't considered the MATCH function as I would have expected it to be in line with the FIND method and ultimatelty slower than the Array method. This is something I will certainly consider in future.

As with a lot of things in Excel, this will come down to how the function is to be used in the real application. But for me one of the things I always appreciate about this site is that you never cease to learn something new on a pretty regular basis.

Thanks,
Steve.
0
Salesforce Has Never Been Easier

Improve and reinforce salesforce training & adoption using WalkMe's digital adoption platform. Start saving on costly employee training by creating fast intuitive Walk-Thrus for Salesforce. Claim your Free Account Now

 
LVL 26

Expert Comment

by:redmondb
ID: 38821695
Steve,

I can only assume it's because Match is a pure Excel function, whereas Find isn't. (It may also help that Match has less functionality than Find().)

I suspect that from now on, my decision rules will be...
 - Use the Cell method for non-expert OP's or when the table is small enough for acceptable response times.
 - Use the Array method if the Cell method is too slow.
 - Use the Match method if the Array method is too slow.

Regards,
Brian.
0
 
LVL 92

Assisted Solution

by:Patrick Matthews
Patrick Matthews earned 100 total points
ID: 38822355
Last_Free_Man,

You got excellent advice on how to improve your function; there is not really much I could add of value.

However, I would advise you to not try and do this in a UDF, as what you are trying to do can be handled using an array formula, as described in my article here:

http://www.experts-exchange.com/Software/Office_Productivity/Office_Suites/MS_Office/Excel/A_10590-Finding-the-Nth-Lookup-Value-in-an-Excel-List.html

For example, let's say you have a list in A1:E1000.  The column you want to check for a match is C:C, the column with the value to return is B:B, and you want the value corresponding to the third match.

In this case, the formula would be:

{=INDEX($B$1:$B$1000,SMALL(IF($C$1:$C$1000="something",ROW($C$1:$C$1000),""),3),1)}

As an array formula, you would enter it without typing the curly braces, and use Ctrl+Shift+Enter to finish it rather than just Enter.  Excel will then display the curly braces to indicate the array status.

If you wanted, say, the value corresponding to the second-to-last match, then use LARGE():

{=INDEX($B$1:$B$1000,LARGE(IF($C$1:$C$1000="something",ROW($C$1:$C$1000),""),2),1)}

The attached file is your original sample, showing how to use the above techniques to get the answers you were looking for.

Q-28009228.xls

And just so you know: this approach can be extended to cover multiple conditions, such as "fetch me the third value for where Col A = Smith and Col B = whatever".  The article shows how to do that.

Patrick
0
 

Author Comment

by:Last_Free_Man
ID: 38823914
The_Barman, Brian, Patrick! THANKS!

These are ultimate answers and amazing set of solutions!
0
 

Author Comment

by:Last_Free_Man
ID: 38823932
Another issue for UDF "VLOOKUP_Plus" is add approximate search option

New related question: "Approximate search in text data (Excel VBA, Excel SQL)"
http://www.experts-exchange.com/Software/Office_Productivity/Office_Suites/MS_Office/Excel/Q_28010310.html
0
 
LVL 26

Expert Comment

by:redmondb
ID: 38824038
Thanks,  Last_Free_Man, especially for the question!
0

Featured Post

SharePoint Admin?

Enable Your Employees To Focus On The Core With Intuitive Onscreen Guidance That is With You At The Moment of Need.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
EXCEL file checking. 11 48
How to consolidate .ods files in one sheet 30 42
copy down array 24 33
Last "space" before nth "period" in a string 18 54
You need to know the location of the Office templates folder, so that when you create new templates, they are saved to that location, and thus are available for selection when creating new documents.  The steps to find the Templates folder path are …
Access developers frequently have requirements to interact with Excel (import from or output to) in their applications.  You might be able to accomplish this with the TransferSpreadsheet and OutputTo methods, but in this series of articles I will di…
This Micro Tutorial will demonstrate how to use a scrolling table in Microsoft Excel using the INDEX function.
Excel styles will make formatting consistent and let you apply and change formatting faster. In this tutorial, you'll learn how to use Excel's built-in styles, how to modify styles, and how to create your own. You'll also learn how to use your custo…

734 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question