?
Solved

Fastest way to search - 2D array!

Posted on 2004-03-27
18
Medium Priority
?
453 Views
Last Modified: 2012-05-04
Hi all...

I have a 2D array.

arr(1,0)="emp1"
arr(1,1)=20
arr(1,2)="dept"
arr(2,0)="emp2"
arr(2,1)=34
arr(2,2)="pur"
arr(3,0)="emp3"
arr(3,1)=32
arr(3,2)="sal"

and so on...something like that.  So when I get a string argument, I have to return "emp3" or "32" (either one...as specified).

Public Function GetArrayEquivalent(ByVal strString As String) As String
Dim strTmp As String
Dim lngStart As Long
Dim lngEnd As Long

lngEnd = 1000 'Number of array items/rows

strTmp = strString

For lngStart = 1 To lngEnd
    If strString  = arr(lngStart, 1) Then strTmp = arr(lngStart, 0): Exit For
Next
GetArrayEquivalent = strTmp
End Function

That method is okay.  But when I use it for searching in bulk, the program becomes very slow.  Of course there is the possibility of DoEvents, but still I feel I am not doing it correctly.  Am I correct?

So is there any way to search an array speedily?

(If I use Collections also, maybe the same thing happens internally...like the same search takes place comparing every array item till it finds the particular item?.)

krupa
0
Comment
Question by:shankarkrupa
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 6
  • 5
  • 4
  • +2
18 Comments
 
LVL 6

Expert Comment

by:___XXX_X_XXX___
ID: 10694622
Do...Loop sometimes is faster than For...Next

Fastest way is to use precompiled assembler code to search array for you, but it is hard to write such code, especially for searching strings when you search multilanguage strings (not only Latin characters).
0
 
LVL 86

Assisted Solution

by:Mike Tomlinson
Mike Tomlinson earned 300 total points
ID: 10694998
A collection does not use an array internally.  It uses a linked list implementation.  If you add an item to a collection with a key value, it can be retrieved back out very quickly.  The collection uses a hash function to store the location in memory of the data item so it can be jumped back to when necessary.  For large data sets, finding an item by key in a collection is almost always faster then finding an item in an array (even if the array is sorted).

Here is an example of how to store data items in a collection and get them back out.

Create a new project and add a ListBox, two TextBoxes, and a Class Module (Class1).

' --------------------------------------------------
' Form1 Code
' --------------------------------------------------
Option Explicit

Private data As New Collection

Private Sub Form_Load()
    Dim f As Class1
     
    Set f = New Class1
    f.name = "emp1"
    f.id = 20
    f.key = "dept"
    data.Add f, f.key
   
    Set f = New Class1
    f.name = "emp2"
    f.id = 34
    f.key = "pur"
    data.Add f, f.key
   
    Set f = New Class1
    f.name = "emp3"
    f.id = 32
    f.key = "sal"
    data.Add f, f.key
   
    For Each f In data
        List1.AddItem f.key
    Next f
End Sub

Private Sub List1_Click()
    On Error GoTo noSuchItem
   
    If List1.ListIndex <> -1 Then
        Dim f As New Class1
        Set f = data.Item(List1.List(List1.ListIndex))
        If f.id <> -1 Then
            Text1.Text = f.name
            Text2.Text = f.id
        End If
    End If
    Exit Sub
   
noSuchItem:
    Text1.Text = ""
    Text2.Text = ""
End Sub

' --------------------------------------------------
' Class1 Code
' --------------------------------------------------
Option Explicit

Private strName As String
Private intId As Integer
Private strKey As String

Private Sub Class_Initialize()
    strName = ""
    intId = -1
    strKey = ""
End Sub

Public Property Let name(newName As String)
    strName = newName
End Property

Public Property Get name() As String
    name = strName
End Property

Public Property Let id(newId As Integer)
    If newId > 0 Then
        intId = newId
    End If
End Property

Public Property Get id() As Integer
    id = intId
End Property

Public Property Let key(newKey As String)
    strKey = newKey
End Property

Public Property Get key() As String
    key = strKey
End Property
0
 
LVL 15

Expert Comment

by:ameba
ID: 10695073
VB.Collection includes fast binary search and you can use it like this:
     Dim colIndices As collection
     ' build a collection
     Set colIndices = New collection
     For i=1 to ubound(arr,1)
         colIndices.Add i, "k" & arr(i,0)
     Next

It will hold the right indexes, e.g. colIndices.Item("kemp1") will return 1, so you can do this:
     myidx = colIndices.Item("kemp1")
     MsgBox arr(myidx,1) & vbCrLf & arr(myidx,2)

"k" is used to make sure key is a string, when adding to collection.

You can also use other, more complicated methods, e.g. "Speed up searches with hash tables"
http://www.vb2themax.com/Item.asp?PageID=TipBank&ID=412

Also, 2D arrays in VB are a bit slow, I suggest using three 1D arrays instead (arrNames, arrAge, arrSomething), each with its best data type, if possible.
0
Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 6

Expert Comment

by:___XXX_X_XXX___
ID: 10695165
For author:
Remember that you CAN'T add two items in collection with same keys. So you can't add this to your collection:

colCollection.Add "Some number","Key1"
colCollection.Add "Some number","Key2"
colCollection.Add "Some number","Key3"
colCollection.Add "Some number","Key2" ' An error will occur here because the key is already associated with item in collection.

Does you 2D array contains same values ? Like
arr(1,0)="emp1"
arr(1,1)=20
arr(1,2)="dept"
arr(2,0)="emp2"
arr(2,1)=34
arr(2,2)="pur"
arr(3,0)="emp3"
arr(3,1)=32
arr(3,2)="pur" ' The same value as arr(2,2), so you cannot use "pur" as key for your collection twice.


0
 
LVL 3

Author Comment

by:shankarkrupa
ID: 10695271
___XXX_X_XXX___:

Thanks for the suggestion. But machine code...me...? I am out. :-(

I tested using the Do/While..but I could not see a remarkable difference.

By the way, of course you are correct. I am searching Unicode characters (non-English). :-)

Idle_Mind:

I have not made the entire array contents into a collection object still.  I will feed all the entries first.

Actually, the search will not be based on just the third item...it can be anything.  Sometimes it may to search by giving the "emp1" string and search for the other two items, sometimes it may be "20" and search for the first and third items, sometimes it will be "dept" to search for the first or second items.

So does it mean I have to make three copies of the content and look in the appropriate collection item?

I fear it will affect the speed...

Not necessarily that I should have an array/collection sort of things. It can be anything.  Either a speedier search subroutine, or some other format/method to save the data and retrieve?

What is the best method?

krupa
0
 
LVL 15

Expert Comment

by:ameba
ID: 10695326
You can use 3 collections, or, to search anywhere, add other items to the same collection:
     For i=1 to ubound(arr,1)
         colIndices.Add i, "k" & arr(i,1)
     Next
     For i=1 to ubound(arr,1)
         colIndices.Add i, "k" & arr(i,2)
     Next
0
 
LVL 86

Expert Comment

by:Mike Tomlinson
ID: 10695353
or add the same item three times into one collection and use each value as the key:

Set f = New Class1
f.name = "emp1"
f.id = 20
f.key = "dept"
data.Add f, f.name
data.Add f, f.id
data.Add f, f.key

Even though the object has been added three times, they all reference the same object.  If you make a change however, then you would have to remove all three references using the old values and then add them back in using the new values since a collection items key cannot be changed.

As mentioned before, you can only have one item per collection with a specific key.  So this approach will only work if there are no duplicate values across your entire data set.

Idle_Mind
0
 
LVL 86

Expert Comment

by:Mike Tomlinson
ID: 10695357
you would have to change

    data.Add f, f.id

to

    data.Add f, Cstr(f.id)

because a key must be a string value
0
 
LVL 76

Expert Comment

by:David Lee
ID: 10696916
Why not store the information in a database and use its built in capabilities to find the data?
0
 
LVL 3

Author Comment

by:shankarkrupa
ID: 10704912
ameba:
---------

Your code apart, the link you gave also threw some knowledge in my mind.  Thanks.

But will this be speedier than the array method?

I am working on to convert all the 2D array items to three separate arrays, though.


___XXX_X_XXX___:
----------------------------

Of course no array item has the same value...everything is unique.  The example I gave (the employee, department, and age) is just for an example.  In reality, I am dealing with an unicode editor for my language.  We already have an 8-bit encoding standard.

While moving to unicode, we need to take care of all the three things - the Roman script, the-now-old 8-bit encoding, and the unicode.  I was afraid it may complicate  things further actually...that is why I translated the real thing to this dummy employee table.  I am very sorry if I did anything wrong, honestly I did not intend to present a false data.

Idle_Mind:
-------------

Having a unique key is not a problem for this program.  Because, as I said, it involves characters (of a spoken/written language, not the computer language :-)) so each letter is unique in itself.  All I am worried is the speed.  Really even arrays do not fit in the picture when it comes to large data transfer. :-((

Well, I should try it with collections + 1D array first, as ameba suggested.

BlueDevilFan:
-----------------

Oh no...accessing things from database will be even slower.  I am looking for someway to speed up things even quicker...quicker even than arrays or doing it better with arrays/collections itself.

krupa
0
 
LVL 3

Author Comment

by:shankarkrupa
ID: 10727048
Hello all...

I am facing a very strange problem.  When I tried to load the bunch of data to a Collection object/variable, I get an error that the key has already been defined.

I am pretty sure that the ascii values of the letters I loaded are different, but it says that it has already been added and identifies a particular set of letters with another one.

Seems I have to satisfy myself with just arrays...:-((

krupa
0
 
LVL 15

Expert Comment

by:ameba
ID: 10727944
Then you cannot use collection for your binary search - its key is not case-sensitive.
You can use mentioned hash search, or sort your array and use binary search like
"BinarySearch - Fast search in a sorted array"
http://www.vb2themax.com/Item.asp?PageID=CodeBank&ID=17

Linear search can be fast sometimes, e.g. if you are comparing LONGs, 1000 comparisons will be done in less than 1 ms - there's no point in adding to complexity, to reduce number of comparisons to 11-12, if you are happy with the speed.
0
 
LVL 3

Author Comment

by:shankarkrupa
ID: 10771349
When I try to add the data to the collection, I get the error that the key has already been added.  Apparently the key is not the same one.

The ascii value is different, apparently.

It is a big load of data, however I am posting some of it here.  When I try to load it to the collection, I get the error.

arr(1, 0) = "«"
arr(2, 0) = "¬ "
arr(3, 0) = "þ"
arr(4, 0) = "®"
arr(5, 0) = "¯"
arr(6, 0) = "°"
arr(7, 0) = "±"
arr(8, 0) = "²"
arr(9, 0) = "³"
arr(10, 0) = "´"
arr(11, 0) = "µ"
arr(12, 0) = "¶"
arr(13, 0) = "·"
arr(14, 0) = "ì"
arr(15, 0) = "í"
arr(16, 0) = "î"
arr(17, 0) = "ï"
arr(18, 0) = "ð"
arr(19, 0) = "ñ"
arr(20, 0) = "ò"
arr(21, 0) = "ó"
arr(22, 0) = "ô"
arr(23, 0) = "õ"
arr(24, 0) = "ö"
arr(25, 0) = "÷"
arr(26, 0) = "ø"
arr(27, 0) = "ù"
arr(28, 0) = "ú"
arr(29, 0) = "û"
arr(30, 0) = "ü"
arr(31, 0) = "ý"
arr(32, 0) = "&#352;"
arr(33, 0) = "&#8249;"
arr(34, 0) = "&#710;"
arr(35, 0) = "&#338;"
arr(36, 0) = "&#8240;"
arr(37, 0) = "&#8218;"
arr(38, 0) = "¸"
arr(39, 0) = "¸¡"
arr(40, 0) = "¸¢"
arr(41, 0) = "¸£"
arr(42, 0) = "Ì"
arr(43, 0) = "Ü"
arr(44, 0) = "¦¸"
arr(45, 0) = "§¸"
arr(46, 0) = "¨¸"
arr(47, 0) = "¦¸¡"
arr(48, 0) = "§¸¡"
arr(49, 0) = "¦¸ª"
arr(50, 0) = "¹"
arr(51, 0) = "¹¡"
arr(52, 0) = "¹¢"
arr(53, 0) = "¹£"
arr(54, 0) = "&#8482;"
arr(55, 0) = "&#8250;"
arr(56, 0) = "¦¹"
arr(57, 0) = "§¹"
arr(58, 0) = "¨¹"
arr(59, 0) = "¦¹¡"
arr(60, 0) = "§¹¡"
arr(61, 0) = "¦¹ª"
arr(62, 0) = "º"
arr(63, 0) = "º¡"
arr(64, 0) = "º¢"
arr(65, 0) = "º£"
arr(66, 0) = "Í"
arr(67, 0) = "Ý"
arr(68, 0) = "¦º"
arr(69, 0) = "§º"
arr(70, 0) = "¨º"
arr(71, 0) = "¦º¡"
arr(72, 0) = "§º¡"
arr(73, 0) = "¦ºª"
arr(74, 0) = "»"
arr(75, 0) = "»¡"
arr(76, 0) = "Ȣ"
arr(77, 0) = "ȣ"
arr(78, 0) = "&#353;"
arr(79, 0) = "&#339;"
arr(80, 0) = "¦»"
arr(81, 0) = "§»"
arr(82, 0) = "¨»"
arr(83, 0) = "¦»¡"
arr(84, 0) = "§»¡"
arr(85, 0) = "¦»ª"

krupa
0
 
LVL 15

Expert Comment

by:ameba
ID: 10772770
As I said, in that case you cannot use collection.  arr(14,0) and arr(42,0) are the same for the collection.
You'll have to use different method.
Vb2themax site changed and the links are not working any more.
Not sure if finding other examples will help you, or you already decided you'll use just array.
0
 
LVL 15

Accepted Solution

by:
ameba earned 300 total points
ID: 10772943
Google still has cached vb2themax articles. Here is HashTable class:
http://66.102.9.104/search?q=cache:mItWesUfFaIJ:www.vb2themax.com/Item.asp%3FID%3D411%26PageID%3DCodeBank

and here is how to use it with your values:

Option Explicit
Dim arr() As String

Private Sub Form_Click()
    Dim ht As HashTable, i As Long
    Set ht = New HashTable
   
    For i = 1 To UBound(arr)
        ht.Add "k" & arr(i, 0), i
    Next
   
    MsgBox ht.Item("k" & "Ì")  ' shows 42
End Sub

Private Sub Form_Load()
    ReDim arr(1 To 85, 0)
    arr(1, 0) = "«"
    arr(2, 0) = "¬ "
    arr(3, 0) = "þ"
    arr(4, 0) = "®"
    arr(5, 0) = "¯"
    arr(6, 0) = "°"
    arr(7, 0) = "±"
    arr(8, 0) = "²"
    arr(9, 0) = "³"
    arr(10, 0) = "´"
    arr(11, 0) = "µ"
    arr(12, 0) = "¶"
    arr(13, 0) = "·"
    arr(14, 0) = "ì"
    arr(15, 0) = "í"
    arr(16, 0) = "î"
    arr(17, 0) = "ï"
    arr(18, 0) = "ð"
    arr(19, 0) = "ñ"
    arr(20, 0) = "ò"
    arr(21, 0) = "ó"
    arr(22, 0) = "ô"
    arr(23, 0) = "õ"
    arr(24, 0) = "ö"
    arr(25, 0) = "÷"
    arr(26, 0) = "ø"
    arr(27, 0) = "ù"
    arr(28, 0) = "ú"
    arr(29, 0) = "û"
    arr(30, 0) = "ü"
    arr(31, 0) = "ý"
    arr(32, 0) = "&#352;"
    arr(33, 0) = "&#8249;"
    arr(34, 0) = "&#710;"
    arr(35, 0) = "&#338;"
    arr(36, 0) = "&#8240;"
    arr(37, 0) = "&#8218;"
    arr(38, 0) = "¸"
    arr(39, 0) = "¸¡"
    arr(40, 0) = "¸¢"
    arr(41, 0) = "¸£"
    arr(42, 0) = "Ì"
    arr(43, 0) = "Ü"
    arr(44, 0) = "¦¸"
    arr(45, 0) = "§¸"
    arr(46, 0) = "¨¸"
    arr(47, 0) = "¦¸¡"
    arr(48, 0) = "§¸¡"
    arr(49, 0) = "¦¸ª"
    arr(50, 0) = "¹"
    arr(51, 0) = "¹¡"
    arr(52, 0) = "¹¢"
    arr(53, 0) = "¹£"
    arr(54, 0) = "&#8482;"
    arr(55, 0) = "&#8250;"
    arr(56, 0) = "¦¹"
    arr(57, 0) = "§¹"
    arr(58, 0) = "¨¹"
    arr(59, 0) = "¦¹¡"
    arr(60, 0) = "§¹¡"
    arr(61, 0) = "¦¹ª"
    arr(62, 0) = "º"
    arr(63, 0) = "º¡"
    arr(64, 0) = "º¢"
    arr(65, 0) = "º£"
    arr(66, 0) = "Í"
    arr(67, 0) = "Ý"
    arr(68, 0) = "¦º"
    arr(69, 0) = "§º"
    arr(70, 0) = "¨º"
    arr(71, 0) = "¦º¡"
    arr(72, 0) = "§º¡"
    arr(73, 0) = "¦ºª"
    arr(74, 0) = "»"
    arr(75, 0) = "»¡"
    arr(76, 0) = "»¢"
    arr(77, 0) = "»£"
    arr(78, 0) = "&#353;"
    arr(79, 0) = "&#339;"
    arr(80, 0) = "¦»"
    arr(81, 0) = "§»"
    arr(82, 0) = "¨»"
    arr(83, 0) = "¦»¡"
    arr(84, 0) = "§»¡"
    arr(85, 0) = "¦»ª"
End Sub
0
 
LVL 3

Author Comment

by:shankarkrupa
ID: 10907540
Hello Ameba...

With this thread, I learned new things....it is nice.

Of course I am not an enemy to Hashtable/collections to decide not to use it. :-)
 
(just kidding)...

But there is not a remarkable performance speed when it comes to bulk search & replace....like processing 4000 or so characters.

However, I am thankful for all these offered help.

And I want to give points (split?) to both Ameba & Idle_Mind.  But I fear one of you will reject it if I post another linke for the points.  Promise (anyone of you) that you will accept if I post like "Points for <you>"...


Krupa
0
 
LVL 86

Expert Comment

by:Mike Tomlinson
ID: 10907800
You can split the points without creating a new question.

Take a look here to find out how: http://www.experts-exchange.com/Programming/Programming_Languages/Visual_Basic/help.jsp#hi69

Idle_Mind
0
 
LVL 15

Expert Comment

by:ameba
ID: 10907863
You can use Split feature - make sure grade is A, if you think it is not, you can ask for clarification.
Thanks!
0

Featured Post

Free Tool: Path Explorer

An intuitive utility to help find the CSS path to UI elements on a webpage. These paths are used frequently in a variety of front-end development and QA automation tasks.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Have you ever wanted to restrict the users input in a textbox to numbers, and while doing that make sure that they can't 'cheat' by pasting in non-numeric text? Of course you can do that with code you write yourself but it's tedious and error-prone …
Enums (shorthand for ‘enumerations’) are not often used by programmers but they can be quite valuable when they are.  What are they? An Enum is just a type of variable like a string or an Integer, but in this case one that you create that contains…
Get people started with the process of using Access VBA to control Outlook using automation, Microsoft Access can control other applications. An example is the ability to programmatically talk to Microsoft Outlook. Using automation, an Access applic…
Get people started with the utilization of class modules. Class modules can be a powerful tool in Microsoft Access. They allow you to create self-contained objects that encapsulate functionality. They can easily hide the complexity of a process from…
Suggested Courses

771 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question