Filrering multple duplicates from an array

Posted on 2007-11-30
Last Modified: 2008-02-01
I am implementing a minimal spanning tree algorithm, and the only thing that I am stuck on is filtering out duplicate characters from an array (in the form of a queue) what I am trying to do, is dequeue the first character, compare it to all the other characters in the array then enqueue all others(by dequeueing) in a new queue. I set new queue equal to the old one in the for loop. here is the method,
public void filter()
              Queue q = new Queue();
              Queue q2 = new Queue();
              Queue q3 = new Queue();
              for(int i = 0; i < data.length; i++) //populate the queue with all vertices
                    q.enqueue(data[i].getB());  //this is how I populate the original
              }                                                                //queue from other parts of the
                                                                                                             //  progam that is working
                                                                //filter out duplicates
              int x=0,y = 0;
                    Object k = q.dequeue();
                    q3.enqueue(k);     //this is suposed to be the queue    
                                                                                              //without duplicates       
                    for(int j = 0; j < q.currentSize; j++)
                          if(!k.equals(q.getFront()))                         }

Question by:intrigue63
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
LVL 12

Accepted Solution

PCableGuy earned 500 total points
ID: 20386783
Pseudocode below, please write in the language of your choice and test it for accuracy.

While(Q is not empty)
   Object temp = Q.remove();
   while(Q is not empty)
    //This part removes the duplicates 
    If (temp equals Q.peek()) // peek, but don't remove yet
       Q.remove() // If it's equal to temp, just remove it
    }// end of inner while loop
newQ.add(temp); // enqueue temp
}// end of outer while loop

Open in new window

LVL 12

Expert Comment

ID: 20386794
Hi intrigue63,

Please disregard my pseudocode above, nothing is advancing the Q (queue) to the next value in either of the while loops, so as is it won't work. But it might give you a general idea that you need to nest two loops to remove duplicates. Sorry.

Expert Comment

ID: 20388658
If your set of possible characters is small enough, as in ASCII but not
Unicode, you can use an array that stores the queue rank of the
character rather than the character itself.
 1. Create an array of int. The size of the array should be equal to the
      number of possible characters supported by the character set,
      127 in the case of ASCII.
 2. Initialize the array to 0 to indicate character not yet found.
 3. Run through your queue using a counter to keep track of the rank of
     the character as it is popped out.
 4. Using the character value as index into the array, check if the array
     contents are still 0, and if so, store its rank. This is to ensure we retain
     the first occurence of the character and filter out the subsequent
When the loop exits, the array contains the rank of the first occurence
of each character in the queue and 0 for those characters that do not
occur in the queue.
 5. Create a new array of char. The size of the array should be equal
    to the queue size.
 6. Initialize the array to some invalid value not part of the character set,
     say 127 for ASCII. Or use an array of int and initialize to -1. This invalid
     value indicates the entry should be skipped.
 7. Run through the first array using the array index as the character value
     and pick up its contents that represent the rank in the original queue.
 8. Store this information in transposed way in the second array, using the
     rank as index into the array and the character as value to be stored.
When the loop exits, the second array will contain the desired characters
in the desired order with duplicated removed, but will contain some
interspersed invalid values at the locations where the duplicates existed.
 9. Now create a new queue which we will eventually return as the result.
10. Initialize the queue to EMPTY.
11. Run through the second array and check each character so retrieved
      to verify that it is a valid value.
12. Drop invalid values and push valid characters into the queue.
When the loop exits, you will have the desired result in the second queue.
We have avoided nested loops and obtained a linear algorithm. But this
works only for small character spaces like ASCII.

For large character spaces like Unicode, the best we can get is n*log(n).
 1. Create a single two-dimensinal array of (int,char) that will work as a
     table with two columns.
 2. Populate both the character and its queue-rank in successive locations
     of the array.
 3. Sort the array on the char column as primary and queue-rank as
     secondary. The array now has duplicate characters in
     consecutive locations and in the same order as they occured in the
     original queue.
 4. Run through the array and replace the duplicate occurences with an
     invalid value.
 5. Sort the array again, this time on queue-rank, which basically brings
     it back into its original order, but with duplicates replaced by the
     invalid value.
 6. Run through the array and push only the valid character values into
     a result queue that was initialized as EMPTY.
LVL 84

Expert Comment

ID: 20388794
you can do O(n) expected time with a hash table of seen values

Featured Post

Free Tool: Path Explorer

An intuitive utility to help find the CSS path to UI elements on a webpage. These paths are used frequently in a variety of front-end development and QA automation tasks.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Entering a date in Microsoft Access can be tricky. A typo can cause month and day to be shuffled, entering the day only causes an error, as does entering, say, day 31 in June. This article shows how an inputmask supported by code can help the user a…
In this post we will learn different types of Android Layout and some basics of an Android App.
With the power of JIRA, there's an unlimited number of ways you can customize it, use it and benefit from it. With that in mind, there's bound to be things that I wasn't able to cover in this course. With this summary we'll look at some places to go…
I've attached the XLSM Excel spreadsheet I used in the video and also text files containing the macros used below.…

696 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question