Filrering multple duplicates from an array

Posted on 2007-11-30
Last Modified: 2008-02-01
I am implementing a minimal spanning tree algorithm, and the only thing that I am stuck on is filtering out duplicate characters from an array (in the form of a queue) what I am trying to do, is dequeue the first character, compare it to all the other characters in the array then enqueue all others(by dequeueing) in a new queue. I set new queue equal to the old one in the for loop. here is the method,
public void filter()
              Queue q = new Queue();
              Queue q2 = new Queue();
              Queue q3 = new Queue();
              for(int i = 0; i < data.length; i++) //populate the queue with all vertices
                    q.enqueue(data[i].getB());  //this is how I populate the original
              }                                                                //queue from other parts of the
                                                                                                             //  progam that is working
                                                                //filter out duplicates
              int x=0,y = 0;
                    Object k = q.dequeue();
                    q3.enqueue(k);     //this is suposed to be the queue    
                                                                                              //without duplicates       
                    for(int j = 0; j < q.currentSize; j++)
                          if(!k.equals(q.getFront()))                         }

Question by:intrigue63
  • 2
LVL 12

Accepted Solution

PCableGuy earned 500 total points
ID: 20386783
Pseudocode below, please write in the language of your choice and test it for accuracy.

While(Q is not empty)


   Object temp = Q.remove();

   while(Q is not empty)


    //This part removes the duplicates 

    If (temp equals Q.peek()) // peek, but don't remove yet


       Q.remove() // If it's equal to temp, just remove it


    }// end of inner while loop

newQ.add(temp); // enqueue temp

}// end of outer while loop

Open in new window

LVL 12

Expert Comment

ID: 20386794
Hi intrigue63,

Please disregard my pseudocode above, nothing is advancing the Q (queue) to the next value in either of the while loops, so as is it won't work. But it might give you a general idea that you need to nest two loops to remove duplicates. Sorry.

Expert Comment

ID: 20388658
If your set of possible characters is small enough, as in ASCII but not
Unicode, you can use an array that stores the queue rank of the
character rather than the character itself.
 1. Create an array of int. The size of the array should be equal to the
      number of possible characters supported by the character set,
      127 in the case of ASCII.
 2. Initialize the array to 0 to indicate character not yet found.
 3. Run through your queue using a counter to keep track of the rank of
     the character as it is popped out.
 4. Using the character value as index into the array, check if the array
     contents are still 0, and if so, store its rank. This is to ensure we retain
     the first occurence of the character and filter out the subsequent
When the loop exits, the array contains the rank of the first occurence
of each character in the queue and 0 for those characters that do not
occur in the queue.
 5. Create a new array of char. The size of the array should be equal
    to the queue size.
 6. Initialize the array to some invalid value not part of the character set,
     say 127 for ASCII. Or use an array of int and initialize to -1. This invalid
     value indicates the entry should be skipped.
 7. Run through the first array using the array index as the character value
     and pick up its contents that represent the rank in the original queue.
 8. Store this information in transposed way in the second array, using the
     rank as index into the array and the character as value to be stored.
When the loop exits, the second array will contain the desired characters
in the desired order with duplicated removed, but will contain some
interspersed invalid values at the locations where the duplicates existed.
 9. Now create a new queue which we will eventually return as the result.
10. Initialize the queue to EMPTY.
11. Run through the second array and check each character so retrieved
      to verify that it is a valid value.
12. Drop invalid values and push valid characters into the queue.
When the loop exits, you will have the desired result in the second queue.
We have avoided nested loops and obtained a linear algorithm. But this
works only for small character spaces like ASCII.

For large character spaces like Unicode, the best we can get is n*log(n).
 1. Create a single two-dimensinal array of (int,char) that will work as a
     table with two columns.
 2. Populate both the character and its queue-rank in successive locations
     of the array.
 3. Sort the array on the char column as primary and queue-rank as
     secondary. The array now has duplicate characters in
     consecutive locations and in the same order as they occured in the
     original queue.
 4. Run through the array and replace the duplicate occurences with an
     invalid value.
 5. Sort the array again, this time on queue-rank, which basically brings
     it back into its original order, but with duplicates replaced by the
     invalid value.
 6. Run through the array and push only the valid character values into
     a result queue that was initialized as EMPTY.
LVL 84

Expert Comment

ID: 20388794
you can do O(n) expected time with a hash table of seen values

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
stringclean challenge 26 70
Homework Help 5 72
Export Table to CSV - Access to CSV - using python 18 100
Programatically extract date from website 8 67
Iteration: Iteration is repetition of a process. A student who goes to school repeats the process of going to school everyday until graduation. We go to grocery store at least once or twice a month to buy products. We repeat this process every mont…
This article will show, step by step, how to integrate R code into a R Sweave document
An introduction to basic programming syntax in Java by creating a simple program. Viewers can follow the tutorial as they create their first class in Java. Definitions and explanations about each element are given to help prepare viewers for future …
In this seventh video of the Xpdf series, we discuss and demonstrate the PDFfonts utility, which lists all the fonts used in a PDF file. It does this via a command line interface, making it suitable for use in programs, scripts, batch files — any pl…

920 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

15 Experts available now in Live!

Get 1:1 Help Now