Solved

Removing Duplicate Entries in an Array

Posted on 2000-03-27
Last Modified: 2010-05-02
I am writing a VB app that reads an external file into an array.
I then want to parse the array and remove any duplicate entries. The entries are name/email address pairs, and I am only concerned with duplicate email addresses, which is why I step through the arrays by 2. I have made 2 identical copies of the array and have been attempting to do it like this:

Dim myComp
For i = 2 To k Step 2
   For j = 4 To k Step 2
       If i = j Then
       j = j + 2
       End If
       If array1(i) = "" Then
       j = 30 'this is past the end of the array and needs to be more dynamic as the size of the array will be.
       End If
       myComp = StrComp(array1(i), array2(j), vbTextCompare)
       If myComp = 0 Then
       array2(j) = ""
       array1(j) = ""
       End If
    Next
Next


Unfortunately, that doesn't work, and I am really not sure why.  Doing it this way is not set in stone and if someone knows of a better algorithm to accomplish this that is fine, or if you know how to modify what I have to remove the duplicates that would be great.
Thanks,
Chris

Question by:churley
10 Comments
 
LVL 17

Expert Comment

by:calacuccia
ID: 2662562
Hi Churley,

This shorter code should do the job. You actually only need one array, and you are better off using the LBound and UBound functions (LBound returns the lowest array index, often 0, and UBound returns the highest).

Also, you only need forward checking: once you have compared 2 with 6, you don't have to compare 6 with 2 again. The If array1(i) = "" check is not necessary either.

Dim myComp
For i = LBound(array1) + 1 To UBound(array1) Step 2
    For j = i + 2 To UBound(array1) Step 2
        myComp = StrComp(array1(i), array1(j), vbTextCompare)
        If myComp = 0 Then
            array1(j) = ""
        End If
    Next
Next



Hope this helps

Calacuccia
 

Author Comment

by:churley
ID: 2662636
Actually, that throws it into an infinite loop in the 2nd For loop.  So I put a constant upper bound in and it doesn't infinite loop but it doesn't pull out the duplicates either.
Thanks.
 
LVL 17

Expert Comment

by:calacuccia
ID: 2662677
Hi Chris

I don't quite understand; it certainly worked when I tested it.

Could you tell me how you output the array once it has been processed?

The 2nd loop should never be infinite, as it only runs from i + 2 (and i itself keeps increasing) to the upper bound of the array. The only thing I can think of is that the very large size of your array makes it look infinite.

Are the duplicate records exact copies, and are you sure the email is the 2nd of each pair and not the first? That could depend on how your array is initialized.

Calacuccia
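
To put a number on the "looks infinite" suspicion: the pairwise loop performs about m(m-1)/2 comparisons for m email entries, so an array with, say, 500,000 emails would need roughly 1.25 x 10^11 StrComp calls. A common VB6 alternative is to use a Collection keyed by the email, since adding a key that already exists raises run-time error 457 and Collection keys are compared case-insensitively. The following is only a sketch of that swapped-in technique, not the method used in this thread; the names BlankDuplicateEmails, entries and lastIndex are hypothetical, and it assumes a 1-based String array with names at odd indexes and emails at even indexes.

```vb
' Sketch: blanks the email slot of every pair whose email was already seen.
' Assumption: entries() is 1-based, odd index = name, even index = email.
Private Sub BlankDuplicateEmails(entries() As String, ByVal lastIndex As Long)
    Dim seen As Collection
    Dim i As Long

    Set seen = New Collection
    For i = 2 To lastIndex Step 2
        If entries(i) <> "" Then
            On Error Resume Next
            seen.Add True, entries(i)       ' the email is used as the key
            If Err.Number = 457 Then        ' key already in the collection
                entries(i) = ""             ' so this pair is a duplicate
                Err.Clear
            End If
            On Error GoTo 0
        End If
    Next i
End Sub
```

Each email is touched once, so the work grows with the number of entries rather than with its square. Whether a Collection copes comfortably with several hundred thousand keys on a given machine is something to measure rather than assume.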
 

Expert Comment

by:tcornett
ID: 2662679
I am not sure if this will work in your situation but....

Have you tried sorting the items and then stepping through each one and checking it against the previous?  If it matches the previous, then it is a duplicate and can be deleted.  The loop to delete the duplicates would need to run from LBound + 1 to UBound.

hope this helps.

Good luck and best regards,

- Tom
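
The sort-then-compare idea above can be sketched as follows. This is a minimal illustration that assumes the emails have already been sorted case-insensitively into their own String array (VB6 has no built-in array sort, and with name/email pairs the name would have to move together with its email). BlankSortedDuplicates, emails and lastKept are hypothetical names. Note that each item is compared against the last value that was kept, not simply against the previous slot, because the previous slot may already have been blanked.

```vb
' Sketch: blanks every repeat in an already sorted array of emails.
' Assumption: emails() is non-empty and sorted case-insensitively.
Private Sub BlankSortedDuplicates(emails() As String)
    Dim i As Long
    Dim lastKept As String

    lastKept = emails(LBound(emails))
    For i = LBound(emails) + 1 To UBound(emails)
        If StrComp(emails(i), lastKept, vbTextCompare) = 0 Then
            emails(i) = ""              ' same as the last kept value
        Else
            lastKept = emails(i)        ' new value, remember it
        End If
    Next i
End Sub
```

After the sort, this pass is a single loop over the array, so the overall cost is dominated by the sort itself.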
 

Expert Comment

by:tcornett
ID: 2662689
One more thought: while you are sorting the array, you could add a condition to the sorting algorithm you use so that when two items compare equal, one of them is deleted.

- Tom
 

Author Comment

by:churley
ID: 2662722
calacuccia -
Yeah, I am sure it is the second and not the first; I used the debugger.  I think you are right though: it is a very large array (1 million max) and that is probably what made it seem infinite.

I am outputting to an external file using the following:

Count1 = 1
Open "single.txt" For Output As FileNum
l = 0
Do Until l > k
 l = l + 2
 If array2(l) = "" Then
 l = l + 2
 Else
 Print #FileNum, array2(l - 1)
 Print #FileNum, array2(l)
 
 Count1 = Count1 + 1
 End If
Loop
Close FileNum
 

Author Comment

by:churley
ID: 2662754
Actually, this may help....here is my entire code segment for the remove button with yours added in and my old one commented out.  Maybe this will help make some sense of it.

Private Sub cmdRemove_Click()
FileNum = FreeFile
Open "entries.txt" For Input As FileNum
Do Until EOF(FileNum)
    k = k + 1
    Line Input #FileNum, NextLine
    LinesFromFile = LinesFromFile + NextLine + Chr(13) '+ Chr(10)
    If LinesFromFile = "" Then
    EOF (FileNum)
    Else
    array1(k) = LinesFromFile
    array2(k) = LinesFromFile
    LinesFromFile = ""
    End If
Loop
i = 2
j = 4


Dim myComp
For i = LBound(array1) + 1 To UBound(array1) Step 2
   For j = i + 2 To UBound(array1) Step 2
       myComp = StrComp(array1(i), array1(j), vbTextCompare)
       If myComp = 0 Then
       array1(j) = "" 
       End If
    Next
Next





'Dim myComp
'For i = 2 To k Step 2
'   For j = 4 To k Step 2
'       If i = j Then
'       j = j + 2
'       End If
'       If array1(i) = "" Then
'       j = 30
'       End If
'       myComp = StrComp(array1(i), array2(j), vbTextCompare)
'       If myComp = 0 Then
'       array2(j) = ""
'       array1(j) = ""
'       End If
'    Next
'Next

Close FileNum
     
Count1 = 1
Open "single.txt" For Output As FileNum
l = 0
Do Until l > k
 l = l + 2
 If array1(l) = "" Then
 l = l + 2
 Else
 Print #FileNum, array1(l - 1)
 Print #FileNum, array1(l)
 
 Count1 = Count1 + 1
 End If
Loop
Close FileNum
cmdRemove.Enabled = False
cmdChoose.Enabled = True

End Sub
 
LVL 17

Accepted Solution

by:
calacuccia earned 100 total points
ID: 2662789
Hi churley,

I would add a declaration for k at the start of your sub:

Dim k As Integer

That will automatically set k to 0 at the start of your sub.

Then test again.

If that is not satisfactory, try to alter the start of the first loop as follows:

For i = LBound(array1) To UBound(array1) Step 2

or

For i = LBound(array1) + 2 To UBound(array1) Step 2
   
You are right that k could be used instead of UBound.

Short of that, I don't see anything.

Hope this helps

Calacuccia
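
One caveat on the declaration: a VB6 Integer only goes up to 32,767, so for an array that can hold far more entries than that, a Long counter avoids an overflow error. The sketch below shows one way the read loop could look with explicit declarations and a dynamic array, so that the returned count (rather than the UBound of an oversized array) bounds the later loops. ReadLines, entries and capacity are hypothetical names, and the caller is assumed to declare the array as Dim entries() As String.

```vb
Option Explicit

' Sketch: reads every line of a text file into a 1-based dynamic array
' and returns the number of lines read.
Private Function ReadLines(ByVal path As String, entries() As String) As Long
    Dim fileNum As Integer
    Dim count As Long       ' Long, because Integer overflows past 32,767
    Dim capacity As Long

    capacity = 1024
    ReDim entries(1 To capacity)

    fileNum = FreeFile
    Open path For Input As #fileNum
    Do Until EOF(fileNum)
        count = count + 1
        If count > capacity Then
            capacity = capacity * 2             ' grow in large steps
            ReDim Preserve entries(1 To capacity)
        End If
        Line Input #fileNum, entries(count)
    Loop
    Close #fileNum

    ' Trim to the lines actually read (left untrimmed if the file was empty).
    If count > 0 Then ReDim Preserve entries(1 To count)
    ReadLines = count
End Function
```

With the array trimmed this way, UBound(entries) and the returned count agree, so either can serve as the upper bound of the comparison loops.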
 

Author Comment

by:churley
ID: 2662797
Actually, I just figured it out. My problem wasn't where I thought it was; it was in the section that writes the output file.

I had this:
Count1 = 1
Open "single.txt" For Output As FileNum
l = 0
Do Until l > k
 l = l + 2
 If array2(l) = "" Then
 l = l + 2
 Else
 Print #FileNum, array2(l - 1)
 Print #FileNum, array2(l)
 
 Count1 = Count1 + 1
 End If
Loop


In the If branch where I increment l, I bypassed the Print statements, so good entries were omitted from the output. I added:

Print #FileNum, array2(l - 1)
Print #FileNum, array2(l)
after the l = l + 2 statement and that fixed it.  I am going to give you the points though, since you took the time to help and in all reality I asked the wrong question.
Thanks.
Chris
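
For comparison, the output step can also be written so that the index is advanced in exactly one place, which avoids skipping a pair by construction. This is a hedged sketch rather than the code actually used in the thread: WriteSurvivors, entries and lastIndex are hypothetical names, and it assumes a 1-based String array with names at odd indexes and emails at even indexes, where duplicates have had their email slot set to "".

```vb
' Sketch: writes each name/email pair whose email slot was not blanked.
' Assumption: entries() is 1-based, odd index = name, even index = email.
Private Sub WriteSurvivors(ByVal path As String, entries() As String, _
                           ByVal lastIndex As Long)
    Dim fileNum As Integer
    Dim l As Long

    fileNum = FreeFile
    Open path For Output As #fileNum
    For l = 2 To lastIndex Step 2       ' the For loop alone advances l
        If entries(l) <> "" Then
            Print #fileNum, entries(l - 1)
            Print #fileNum, entries(l)
        End If
    Next l
    Close #fileNum
End Sub
```

Because the loop header is the only thing that changes l, a blanked pair is skipped without also stepping over the pair that follows it, and two blanked pairs in a row are handled the same way.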
 
LVL 17

Expert Comment

by:calacuccia
ID: 2662804
Thanks in return.

And indeed, the l + 2 was applied twice and bypassed some of the good entries. Well spotted; I did not see it either.

Calacuccia