## Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people, just like you, are talking about what matters.

• Help others & share knowledge
• Earn cash & points
Solved

# Filrering multple duplicates from an array

Posted on 2007-11-30
343 Views
I am implementing a minimal spanning tree algorithm, and the only thing that I am stuck on is filtering out duplicate characters from an array (in the form of a queue) what I am trying to do, is dequeue the first character, compare it to all the other characters in the array then enqueue all others(by dequeueing) in a new queue. I set new queue equal to the old one in the for loop. here is the method,
public void filter()
{
Queue q = new Queue();
Queue q2 = new Queue();
Queue q3 = new Queue();
for(int i = 0; i < data.length; i++) //populate the queue with all vertices
{
q.enqueue(data[i].getA());
q.enqueue(data[i].getB());  //this is how I populate the original
}                                                                //queue from other parts of the
//  progam that is working
//filter out duplicates
int x=0,y = 0;
while(!q.isEmpty())
{
Object k = q.dequeue();
q3.enqueue(k);     //this is suposed to be the queue
//without duplicates
for(int j = 0; j < q.currentSize; j++)
if(!k.equals(q.getFront()))                         }

}
}
0
Question by:intrigue63
• 2

LVL 12

Accepted Solution

PCableGuy earned 500 total points
ID: 20386783
Pseudocode below, please write in the language of your choice and test it for accuracy.

``````While(Q is not empty)
{
Object temp = Q.remove();
while(Q is not empty)
{
//This part removes the duplicates
If (temp equals Q.peek()) // peek, but don't remove yet
{
Q.remove() // If it's equal to temp, just remove it
}

}// end of inner while loop

}// end of outer while loop
``````
0

LVL 12

Expert Comment

ID: 20386794
Hi intrigue63,

Please disregard my pseudocode above, nothing is advancing the Q (queue) to the next value in either of the while loops, so as is it won't work. But it might give you a general idea that you need to nest two loops to remove duplicates. Sorry.
0

Expert Comment

ID: 20388658
If your set of possible characters is small enough, as in ASCII but not
Unicode, you can use an array that stores the queue rank of the
character rather than the character itself.
1. Create an array of int. The size of the array should be equal to the
number of possible characters supported by the character set,
127 in the case of ASCII.
2. Initialize the array to 0 to indicate character not yet found.
3. Run through your queue using a counter to keep track of the rank of
the character as it is popped out.
4. Using the character value as index into the array, check if the array
contents are still 0, and if so, store its rank. This is to ensure we retain
the first occurence of the character and filter out the subsequent
occurences.
When the loop exits, the array contains the rank of the first occurence
of each character in the queue and 0 for those characters that do not
occur in the queue.
5. Create a new array of char. The size of the array should be equal
to the queue size.
6. Initialize the array to some invalid value not part of the character set,
say 127 for ASCII. Or use an array of int and initialize to -1. This invalid
value indicates the entry should be skipped.
7. Run through the first array using the array index as the character value
and pick up its contents that represent the rank in the original queue.
8. Store this information in transposed way in the second array, using the
rank as index into the array and the character as value to be stored.
When the loop exits, the second array will contain the desired characters
in the desired order with duplicated removed, but will contain some
interspersed invalid values at the locations where the duplicates existed.
9. Now create a new queue which we will eventually return as the result.
10. Initialize the queue to EMPTY.
11. Run through the second array and check each character so retrieved
to verify that it is a valid value.
12. Drop invalid values and push valid characters into the queue.
When the loop exits, you will have the desired result in the second queue.
We have avoided nested loops and obtained a linear algorithm. But this
works only for small character spaces like ASCII.

For large character spaces like Unicode, the best we can get is n*log(n).
1. Create a single two-dimensinal array of (int,char) that will work as a
table with two columns.
2. Populate both the character and its queue-rank in successive locations
of the array.
3. Sort the array on the char column as primary and queue-rank as
secondary. The array now has duplicate characters in
consecutive locations and in the same order as they occured in the
original queue.
4. Run through the array and replace the duplicate occurences with an
invalid value.
5. Sort the array again, this time on queue-rank, which basically brings
it back into its original order, but with duplicates replaced by the
invalid value.
6. Run through the array and push only the valid character values into
a result queue that was initialized as EMPTY.
0

LVL 84

Expert Comment

ID: 20388794
you can do O(n) expected time with a hash table of seen values
0

## Featured Post

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Computer science students often experience many of the same frustrations when going through their engineering courses. This article presents seven tips I found useful when completing a bachelors and masters degree in computing which I believe may he…
When there is a disconnect between the intentions of their creator and the recipient, when algorithms go awry, they can have disastrous consequences.
With the power of JIRA, there's an unlimited number of ways you can customize it, use it and benefit from it. With that in mind, there's bound to be things that I wasn't able to cover in this course. With this summary we'll look at some places to go…
In this seventh video of the Xpdf series, we discuss and demonstrate the PDFfonts utility, which lists all the fonts used in a PDF file. It does this via a command line interface, making it suitable for use in programs, scripts, batch files — any pl…