Solved

Fastest way to search - 2D array!

Posted on 2004-03-27
18
437 Views
Last Modified: 2012-05-04
Hi all...

I have a 2D array.

arr(1,0)="emp1"
arr(1,1)=20
arr(1,2)="dept"
arr(2,0)="emp2"
arr(2,1)=34
arr(2,2)="pur"
arr(3,0)="emp3"
arr(3,1)=32
arr(3,2)="sal"

and so on...something like that.  So when I get a string argument, I have to return "emp3" or "32" (either one...as specified).

Public Function GetArrayEquivalent(ByVal strString As String) As String
Dim strTmp As String
Dim lngStart As Long
Dim lngEnd As Long

lngEnd = 1000 'Number of array items/rows

strTmp = strString

For lngStart = 1 To lngEnd
    If strString  = arr(lngStart, 1) Then strTmp = arr(lngStart, 0): Exit For
Next
GetArrayEquivalent = strTmp
End Function

That method is okay.  But when I use it for searching in bulk, the program becomes very slow.  Of course there is the possibility of DoEvents, but still I feel I am not doing it correctly.  Am I correct?

So is there any way to search an array speedily?

(If I use Collections also, maybe the same thing happens internally...like the same search takes place comparing every array item till it finds the particular item?.)

krupa
0
Comment
Question by:shankarkrupa
  • 6
  • 5
  • 4
  • +2
18 Comments
 
LVL 6

Expert Comment

by:___XXX_X_XXX___
Comment Utility
Do...Loop sometimes is faster than For...Next

Fastest way is to use precompiled assembler code to search array for you, but it is hard to write such code, especially for searching strings when you search multilanguage strings (not only Latin characters).
0
 
LVL 85

Assisted Solution

by:Mike Tomlinson
Mike Tomlinson earned 75 total points
Comment Utility
A collection does not use an array internally.  It uses a linked list implementation.  If you add an item to a collection with a key value, it can be retrieved back out very quickly.  The collection uses a hash function to store the location in memory of the data item so it can be jumped back to when necessary.  For large data sets, finding an item by key in a collection is almost always faster then finding an item in an array (even if the array is sorted).

Here is an example of how to store data items in a collection and get them back out.

Create a new project and add a ListBox, two TextBoxes, and a Class Module (Class1).

' --------------------------------------------------
' Form1 Code
' --------------------------------------------------
Option Explicit

Private data As New Collection

Private Sub Form_Load()
    Dim f As Class1
     
    Set f = New Class1
    f.name = "emp1"
    f.id = 20
    f.key = "dept"
    data.Add f, f.key
   
    Set f = New Class1
    f.name = "emp2"
    f.id = 34
    f.key = "pur"
    data.Add f, f.key
   
    Set f = New Class1
    f.name = "emp3"
    f.id = 32
    f.key = "sal"
    data.Add f, f.key
   
    For Each f In data
        List1.AddItem f.key
    Next f
End Sub

Private Sub List1_Click()
    On Error GoTo noSuchItem
   
    If List1.ListIndex <> -1 Then
        Dim f As New Class1
        Set f = data.Item(List1.List(List1.ListIndex))
        If f.id <> -1 Then
            Text1.Text = f.name
            Text2.Text = f.id
        End If
    End If
    Exit Sub
   
noSuchItem:
    Text1.Text = ""
    Text2.Text = ""
End Sub

' --------------------------------------------------
' Class1 Code
' --------------------------------------------------
Option Explicit

Private strName As String
Private intId As Integer
Private strKey As String

Private Sub Class_Initialize()
    strName = ""
    intId = -1
    strKey = ""
End Sub

Public Property Let name(newName As String)
    strName = newName
End Property

Public Property Get name() As String
    name = strName
End Property

Public Property Let id(newId As Integer)
    If newId > 0 Then
        intId = newId
    End If
End Property

Public Property Get id() As Integer
    id = intId
End Property

Public Property Let key(newKey As String)
    strKey = newKey
End Property

Public Property Get key() As String
    key = strKey
End Property
0
 
LVL 15

Expert Comment

by:ameba
Comment Utility
VB.Collection includes fast binary search and you can use it like this:
     Dim colIndices As collection
     ' build a collection
     Set colIndices = New collection
     For i=1 to ubound(arr,1)
         colIndices.Add i, "k" & arr(i,0)
     Next

It will hold the right indexes, e.g. colIndices.Item("kemp1") will return 1, so you can do this:
     myidx = colIndices.Item("kemp1")
     MsgBox arr(myidx,1) & vbCrLf & arr(myidx,2)

"k" is used to make sure key is a string, when adding to collection.

You can also use other, more complicated methods, e.g. "Speed up searches with hash tables"
http://www.vb2themax.com/Item.asp?PageID=TipBank&ID=412

Also, 2D arrays in VB are a bit slow, I suggest using three 1D arrays instead (arrNames, arrAge, arrSomething), each with its best data type, if possible.
0
 
LVL 6

Expert Comment

by:___XXX_X_XXX___
Comment Utility
For author:
Remember that you CAN'T add two items in collection with same keys. So you can't add this to your collection:

colCollection.Add "Some number","Key1"
colCollection.Add "Some number","Key2"
colCollection.Add "Some number","Key3"
colCollection.Add "Some number","Key2" ' An error will occur here because the key is already associated with item in collection.

Does you 2D array contains same values ? Like
arr(1,0)="emp1"
arr(1,1)=20
arr(1,2)="dept"
arr(2,0)="emp2"
arr(2,1)=34
arr(2,2)="pur"
arr(3,0)="emp3"
arr(3,1)=32
arr(3,2)="pur" ' The same value as arr(2,2), so you cannot use "pur" as key for your collection twice.


0
 
LVL 3

Author Comment

by:shankarkrupa
Comment Utility
___XXX_X_XXX___:

Thanks for the suggestion. But machine code...me...? I am out. :-(

I tested using the Do/While..but I could not see a remarkable difference.

By the way, of course you are correct. I am searching Unicode characters (non-English). :-)

Idle_Mind:

I have not made the entire array contents into a collection object still.  I will feed all the entries first.

Actually, the search will not be based on just the third item...it can be anything.  Sometimes it may to search by giving the "emp1" string and search for the other two items, sometimes it may be "20" and search for the first and third items, sometimes it will be "dept" to search for the first or second items.

So does it mean I have to make three copies of the content and look in the appropriate collection item?

I fear it will affect the speed...

Not necessarily that I should have an array/collection sort of things. It can be anything.  Either a speedier search subroutine, or some other format/method to save the data and retrieve?

What is the best method?

krupa
0
 
LVL 15

Expert Comment

by:ameba
Comment Utility
You can use 3 collections, or, to search anywhere, add other items to the same collection:
     For i=1 to ubound(arr,1)
         colIndices.Add i, "k" & arr(i,1)
     Next
     For i=1 to ubound(arr,1)
         colIndices.Add i, "k" & arr(i,2)
     Next
0
 
LVL 85

Expert Comment

by:Mike Tomlinson
Comment Utility
or add the same item three times into one collection and use each value as the key:

Set f = New Class1
f.name = "emp1"
f.id = 20
f.key = "dept"
data.Add f, f.name
data.Add f, f.id
data.Add f, f.key

Even though the object has been added three times, they all reference the same object.  If you make a change however, then you would have to remove all three references using the old values and then add them back in using the new values since a collection items key cannot be changed.

As mentioned before, you can only have one item per collection with a specific key.  So this approach will only work if there are no duplicate values across your entire data set.

Idle_Mind
0
 
LVL 85

Expert Comment

by:Mike Tomlinson
Comment Utility
you would have to change

    data.Add f, f.id

to

    data.Add f, Cstr(f.id)

because a key must be a string value
0
 
LVL 76

Expert Comment

by:David Lee
Comment Utility
Why not store the information in a database and use its built in capabilities to find the data?
0
Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

 
LVL 3

Author Comment

by:shankarkrupa
Comment Utility
ameba:
---------

Your code apart, the link you gave also threw some knowledge in my mind.  Thanks.

But will this be speedier than the array method?

I am working on to convert all the 2D array items to three separate arrays, though.


___XXX_X_XXX___:
----------------------------

Of course no array item has the same value...everything is unique.  The example I gave (the employee, department, and age) is just for an example.  In reality, I am dealing with an unicode editor for my language.  We already have an 8-bit encoding standard.

While moving to unicode, we need to take care of all the three things - the Roman script, the-now-old 8-bit encoding, and the unicode.  I was afraid it may complicate  things further actually...that is why I translated the real thing to this dummy employee table.  I am very sorry if I did anything wrong, honestly I did not intend to present a false data.

Idle_Mind:
-------------

Having a unique key is not a problem for this program.  Because, as I said, it involves characters (of a spoken/written language, not the computer language :-)) so each letter is unique in itself.  All I am worried is the speed.  Really even arrays do not fit in the picture when it comes to large data transfer. :-((

Well, I should try it with collections + 1D array first, as ameba suggested.

BlueDevilFan:
-----------------

Oh no...accessing things from database will be even slower.  I am looking for someway to speed up things even quicker...quicker even than arrays or doing it better with arrays/collections itself.

krupa
0
 
LVL 3

Author Comment

by:shankarkrupa
Comment Utility
Hello all...

I am facing a very strange problem.  When I tried to load the bunch of data to a Collection object/variable, I get an error that the key has already been defined.

I am pretty sure that the ascii values of the letters I loaded are different, but it says that it has already been added and identifies a particular set of letters with another one.

Seems I have to satisfy myself with just arrays...:-((

krupa
0
 
LVL 15

Expert Comment

by:ameba
Comment Utility
Then you cannot use collection for your binary search - its key is not case-sensitive.
You can use mentioned hash search, or sort your array and use binary search like
"BinarySearch - Fast search in a sorted array"
http://www.vb2themax.com/Item.asp?PageID=CodeBank&ID=17

Linear search can be fast sometimes, e.g. if you are comparing LONGs, 1000 comparisons will be done in less than 1 ms - there's no point in adding to complexity, to reduce number of comparisons to 11-12, if you are happy with the speed.
0
 
LVL 3

Author Comment

by:shankarkrupa
Comment Utility
When I try to add the data to the collection, I get the error that the key has already been added.  Apparently the key is not the same one.

The ascii value is different, apparently.

It is a big load of data, however I am posting some of it here.  When I try to load it to the collection, I get the error.

arr(1, 0) = "«"
arr(2, 0) = "¬ "
arr(3, 0) = "þ"
arr(4, 0) = "®"
arr(5, 0) = "¯"
arr(6, 0) = "°"
arr(7, 0) = "±"
arr(8, 0) = "²"
arr(9, 0) = "³"
arr(10, 0) = "´"
arr(11, 0) = "µ"
arr(12, 0) = "¶"
arr(13, 0) = "·"
arr(14, 0) = "ì"
arr(15, 0) = "í"
arr(16, 0) = "î"
arr(17, 0) = "ï"
arr(18, 0) = "ð"
arr(19, 0) = "ñ"
arr(20, 0) = "ò"
arr(21, 0) = "ó"
arr(22, 0) = "ô"
arr(23, 0) = "õ"
arr(24, 0) = "ö"
arr(25, 0) = "÷"
arr(26, 0) = "ø"
arr(27, 0) = "ù"
arr(28, 0) = "ú"
arr(29, 0) = "û"
arr(30, 0) = "ü"
arr(31, 0) = "ý"
arr(32, 0) = "&#352;"
arr(33, 0) = "&#8249;"
arr(34, 0) = "&#710;"
arr(35, 0) = "&#338;"
arr(36, 0) = "&#8240;"
arr(37, 0) = "&#8218;"
arr(38, 0) = "¸"
arr(39, 0) = "¸¡"
arr(40, 0) = "¸¢"
arr(41, 0) = "¸£"
arr(42, 0) = "Ì"
arr(43, 0) = "Ü"
arr(44, 0) = "¦¸"
arr(45, 0) = "§¸"
arr(46, 0) = "¨¸"
arr(47, 0) = "¦¸¡"
arr(48, 0) = "§¸¡"
arr(49, 0) = "¦¸ª"
arr(50, 0) = "¹"
arr(51, 0) = "¹¡"
arr(52, 0) = "¹¢"
arr(53, 0) = "¹£"
arr(54, 0) = "&#8482;"
arr(55, 0) = "&#8250;"
arr(56, 0) = "¦¹"
arr(57, 0) = "§¹"
arr(58, 0) = "¨¹"
arr(59, 0) = "¦¹¡"
arr(60, 0) = "§¹¡"
arr(61, 0) = "¦¹ª"
arr(62, 0) = "º"
arr(63, 0) = "º¡"
arr(64, 0) = "º¢"
arr(65, 0) = "º£"
arr(66, 0) = "Í"
arr(67, 0) = "Ý"
arr(68, 0) = "¦º"
arr(69, 0) = "§º"
arr(70, 0) = "¨º"
arr(71, 0) = "¦º¡"
arr(72, 0) = "§º¡"
arr(73, 0) = "¦ºª"
arr(74, 0) = "»"
arr(75, 0) = "»¡"
arr(76, 0) = "Ȣ"
arr(77, 0) = "ȣ"
arr(78, 0) = "&#353;"
arr(79, 0) = "&#339;"
arr(80, 0) = "¦»"
arr(81, 0) = "§»"
arr(82, 0) = "¨»"
arr(83, 0) = "¦»¡"
arr(84, 0) = "§»¡"
arr(85, 0) = "¦»ª"

krupa
0
 
LVL 15

Expert Comment

by:ameba
Comment Utility
As I said, in that case you cannot use collection.  arr(14,0) and arr(42,0) are the same for the collection.
You'll have to use different method.
Vb2themax site changed and the links are not working any more.
Not sure if finding other examples will help you, or you already decided you'll use just array.
0
 
LVL 15

Accepted Solution

by:
ameba earned 75 total points
Comment Utility
Google still has cached vb2themax articles. Here is HashTable class:
http://66.102.9.104/search?q=cache:mItWesUfFaIJ:www.vb2themax.com/Item.asp%3FID%3D411%26PageID%3DCodeBank

and here is how to use it with your values:

Option Explicit
Dim arr() As String

Private Sub Form_Click()
    Dim ht As HashTable, i As Long
    Set ht = New HashTable
   
    For i = 1 To UBound(arr)
        ht.Add "k" & arr(i, 0), i
    Next
   
    MsgBox ht.Item("k" & "Ì")  ' shows 42
End Sub

Private Sub Form_Load()
    ReDim arr(1 To 85, 0)
    arr(1, 0) = "«"
    arr(2, 0) = "¬ "
    arr(3, 0) = "þ"
    arr(4, 0) = "®"
    arr(5, 0) = "¯"
    arr(6, 0) = "°"
    arr(7, 0) = "±"
    arr(8, 0) = "²"
    arr(9, 0) = "³"
    arr(10, 0) = "´"
    arr(11, 0) = "µ"
    arr(12, 0) = "¶"
    arr(13, 0) = "·"
    arr(14, 0) = "ì"
    arr(15, 0) = "í"
    arr(16, 0) = "î"
    arr(17, 0) = "ï"
    arr(18, 0) = "ð"
    arr(19, 0) = "ñ"
    arr(20, 0) = "ò"
    arr(21, 0) = "ó"
    arr(22, 0) = "ô"
    arr(23, 0) = "õ"
    arr(24, 0) = "ö"
    arr(25, 0) = "÷"
    arr(26, 0) = "ø"
    arr(27, 0) = "ù"
    arr(28, 0) = "ú"
    arr(29, 0) = "û"
    arr(30, 0) = "ü"
    arr(31, 0) = "ý"
    arr(32, 0) = "&#352;"
    arr(33, 0) = "&#8249;"
    arr(34, 0) = "&#710;"
    arr(35, 0) = "&#338;"
    arr(36, 0) = "&#8240;"
    arr(37, 0) = "&#8218;"
    arr(38, 0) = "¸"
    arr(39, 0) = "¸¡"
    arr(40, 0) = "¸¢"
    arr(41, 0) = "¸£"
    arr(42, 0) = "Ì"
    arr(43, 0) = "Ü"
    arr(44, 0) = "¦¸"
    arr(45, 0) = "§¸"
    arr(46, 0) = "¨¸"
    arr(47, 0) = "¦¸¡"
    arr(48, 0) = "§¸¡"
    arr(49, 0) = "¦¸ª"
    arr(50, 0) = "¹"
    arr(51, 0) = "¹¡"
    arr(52, 0) = "¹¢"
    arr(53, 0) = "¹£"
    arr(54, 0) = "&#8482;"
    arr(55, 0) = "&#8250;"
    arr(56, 0) = "¦¹"
    arr(57, 0) = "§¹"
    arr(58, 0) = "¨¹"
    arr(59, 0) = "¦¹¡"
    arr(60, 0) = "§¹¡"
    arr(61, 0) = "¦¹ª"
    arr(62, 0) = "º"
    arr(63, 0) = "º¡"
    arr(64, 0) = "º¢"
    arr(65, 0) = "º£"
    arr(66, 0) = "Í"
    arr(67, 0) = "Ý"
    arr(68, 0) = "¦º"
    arr(69, 0) = "§º"
    arr(70, 0) = "¨º"
    arr(71, 0) = "¦º¡"
    arr(72, 0) = "§º¡"
    arr(73, 0) = "¦ºª"
    arr(74, 0) = "»"
    arr(75, 0) = "»¡"
    arr(76, 0) = "»¢"
    arr(77, 0) = "»£"
    arr(78, 0) = "&#353;"
    arr(79, 0) = "&#339;"
    arr(80, 0) = "¦»"
    arr(81, 0) = "§»"
    arr(82, 0) = "¨»"
    arr(83, 0) = "¦»¡"
    arr(84, 0) = "§»¡"
    arr(85, 0) = "¦»ª"
End Sub
0
 
LVL 3

Author Comment

by:shankarkrupa
Comment Utility
Hello Ameba...

With this thread, I learned new things....it is nice.

Of course I am not an enemy to Hashtable/collections to decide not to use it. :-)
 
(just kidding)...

But there is not a remarkable performance speed when it comes to bulk search & replace....like processing 4000 or so characters.

However, I am thankful for all these offered help.

And I want to give points (split?) to both Ameba & Idle_Mind.  But I fear one of you will reject it if I post another linke for the points.  Promise (anyone of you) that you will accept if I post like "Points for <you>"...


Krupa
0
 
LVL 85

Expert Comment

by:Mike Tomlinson
Comment Utility
You can split the points without creating a new question.

Take a look here to find out how: http://www.experts-exchange.com/Programming/Programming_Languages/Visual_Basic/help.jsp#hi69

Idle_Mind
0
 
LVL 15

Expert Comment

by:ameba
Comment Utility
You can use Split feature - make sure grade is A, if you think it is not, you can ask for clarification.
Thanks!
0

Featured Post

What Is Threat Intelligence?

Threat intelligence is often discussed, but rarely understood. Starting with a precise definition, along with clear business goals, is essential.

Join & Write a Comment

Have you ever wanted to restrict the users input in a textbox to numbers, and while doing that make sure that they can't 'cheat' by pasting in non-numeric text? Of course you can do that with code you write yourself but it's tedious and error-prone …
Background What I'm presenting in this article is the result of 2 conditions in my work area: We have a SQL Server production environment but no development or test environment; andWe have an MS Access front end using tables in SQL Server but we a…
As developers, we are not limited to the functions provided by the VBA language. In addition, we can call the functions that are part of the Windows operating system. These functions are part of the Windows API (Application Programming Interface). U…
Show developers how to use a criteria form to limit the data that appears on an Access report. It is a common requirement that users can specify the criteria for a report at runtime. The easiest way to accomplish this is using a criteria form that a…

763 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now