Best way to compare 2 lists(?) and prune one

Posted on 2003-11-11
Last Modified: 2010-05-01

I'm thinking about creating 2 list-like data structures to manipulate.  I don't really care what the data type is -- use whichever you think is best for this problem:

The first "list" will be populated with a bunch of integers (They represent id_numbers from a db).  This is the reference list.
The second list will also have a bunch of integers.  Neither list is too big, 50 numbers or so.

I'd like an algorithm which looks through the 2nd list, and if it finds a number NOT in the first list, removes it from the 2nd list, and also eliminates duplicates in the second list.  

Put another way, no numbers are allowed in the 2nd list UNLESS they exist in the first list, and no duplicate numbers are allowed in the 2nd list, either.

There may be duplicates in either list to begin with, but in the end the 2nd list must have only unique values.

So, if the first (reference) list has 7  5  6  2  3  4  4  1 56
and the second list has  2  3  4  75  33  6  4  2
then, in the end, the second list can only contain (order doesn't matter)
2    3    4   6

Sorry this is only worth 120 points -- that's all I have left...
Thanks for any help

Question by:vlg
  • 2
LVL 85

Expert Comment

by:Mike Tomlinson
ID: 9723021
Here is one solution...  

It makes use of collections to do a fast compare of the second list to the first.  Duplicates in the second are automatically removed because you cannot have duplicate keys.  The compare routine and the removal of items in the second collection will be EXTREMELY fast since we are doing lookups using the built in hash function in the collectiobn and removal of an item is simply changing the references in the linked list.

The limiting factor in this scheme is memory.  You essentially have to duplicate both listboxes in memory.  If both boxes are small it's not a problem.  We are sacrificing memory for speed.

Anyway, here it is.  Create a new project and on the main form and add two listboxes and a commandbutton.  Add a class module and leave it named "Class1".  

*** IMPORTANT! ***   <<<---------------------------------------
Once you have pasted the code for the class module into Class1, you have to change the properties for the newEnum() function.  Click on Tools --> Procedure Attributes and make sure newEnum is selected in the drop down box.  Then click on Advanced and change the Procedure Id to -4.  Under the attributes section, check the box that says "Hide this member" and hit OK.



' Code for "Form1"
Option Explicit

Private Sub Form_Load()
    ' So, if the first (reference) list has 7  5  6  2  3  4  4  1 56
    ' and the second list has  2  3  4  75  33  6  4  2
    ' then, in the end, the second list can only contain (order doesn't matter)
    ' 2    3    4   6
    List1.AddItem "7"
    List1.AddItem "5"
    List1.AddItem "6"
    List1.AddItem "2"
    List1.AddItem "3"
    List1.AddItem "4"
    List1.AddItem "4"
    List1.AddItem "1"
    List1.AddItem "56"
    List2.AddItem "2"
    List2.AddItem "3"
    List2.AddItem "4"
    List2.AddItem "75"
    List2.AddItem "33"
    List2.AddItem "6"
    List2.AddItem "4"
    List2.AddItem "2"
End Sub

Private Sub Command1_Click()
    Dim colList1 As New Class1
    Dim colList2 As New Class1
    Dim a As Integer
    Dim item As Variant
    ' Add items in first list box to first collection
    For a = 0 To List1.ListCount - 1
        colList1.addWithKey List1.List(a), List1.List(a)
    Next a
    ' Add items in second list box to second collection
    ' Duplicates will not be added because we can't have duplicate keys
    For a = 0 To List2.ListCount - 1
        colList2.addWithKey List2.List(a), List2.List(a)
    Next a
    ' Walk the second collection
    ' If the item from the second collection is not in the first collection,
    ' then remove it from second collection
    For Each item In colList2
        If Not colList1.exists(item) Then
            colList2.removeByKey item
        End If
    Next item
    ' Empty the second list box and populate it with whatever is left in
    ' the second collection!
    For Each item In colList2
        List2.AddItem item
    Next item
End Sub


' Code for "Class1"
Option Explicit

Private myCollection As New Collection

Public Function newEnum() As IUnknown
    On Error Resume Next
    ' Click on Tools --> Procedure Attributes --> Advanced
    ' This function should be a hidden member and the Procedure Id should be -4
    Set newEnum = myCollection.[_NewEnum]
End Function

Public Function count() As Long
    On Error Resume Next
    count = myCollection.count
End Function

Private Sub Class_Terminate()
    On Error Resume Next
    Set myCollection = Nothing
End Sub

Public Sub add(ByVal item As Variant)
    On Error Resume Next
    myCollection.add item
    Exit Sub
End Sub

Public Sub addWithKey(ByVal item As Variant, key As String)
    On Error Resume Next
    myCollection.add item, key
    Exit Sub
End Sub

Public Sub removeByKey(ByVal key As String)
    On Error Resume Next
    myCollection.remove key
End Sub

Public Sub removeByIndex(ByVal index As Long)
    On Error Resume Next
    myCollection.remove index
End Sub

Public Sub clear()
    On Error Resume Next
    'Erase the contents of the collection
    Set myCollection = Nothing
    Set myCollection = New Collection
End Sub

Public Function exists(ByVal key As String) As Boolean
    On Error GoTo exists_Error
    myCollection.add key, key
    exists = False
    myCollection.remove key
    Exit Function
    exists = True
End Function

Public Function itemByIndex(index As Long) As Variant
    If index >= 1 And index <= myCollection.count Then
        Set itemByIndex = myCollection.item(index)
        Set itemByIndex = Empty
    End If
End Function

Public Function itemByKey(key As String) As Variant
    If Me.exists(key) Then
        Set itemByKey = myCollection.item(key)
        Set itemByKey = Empty
    End If
End Function

Author Comment

ID: 9723148

That looks great -- I'll try it right now.  Can you tell me what's the deal with the newEnum() properties, hiding it, and giving it a Procedure number of -4?  Why are all those things being done?

Thanks for the explanation,

LVL 85

Accepted Solution

Mike Tomlinson earned 120 total points
ID: 9723207
That is Microsoft's brilliant way of giving the customer the ability to have a class create its own enumeration ability.  Without doing that hokey pokey dance, you cannot use the For...Next structure to enumerate the items stored in our Class1 instances.


Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
how to open from address saved in DB? 26 173
VBA saving file message display 5 56
Windows 10 start screen issues 9 51
vbModal 12 31
When trying to find the cause of a problem in VBA or VB6 it's often valuable to know what procedures were executed prior to the error. You can use the Call Stack for that but it is often inadequate because it may show procedures you aren't intereste…
You can of course define an array to hold data that is of a particular type like an array of Strings to hold customer names or an array of Doubles to hold customer sales, but what do you do if you want to coordinate that data? This article describes…
As developers, we are not limited to the functions provided by the VBA language. In addition, we can call the functions that are part of the Windows operating system. These functions are part of the Windows API (Application Programming Interface). U…
Get people started with the process of using Access VBA to control Outlook using automation, Microsoft Access can control other applications. An example is the ability to programmatically talk to Microsoft Outlook. Using automation, an Access applic…

929 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now