OrdinaryGeek

Distinct values in a Java vector

I want to get the distinct rows in a Java Vector.   For instance, I have data that looks something like this:

23000  XXXXX  YYYYYY  SSSSS
23000 WWWW XXXXX  RRRRR
45600 XXXXX   XXXXX  SSSSS
87000 XXXXX   YYYYY   SSSSS
87000 YYYYY    XXXXX  RRRRR

The first column with the numbers is the one that I want to be distinct.  I only want the results to be the following in a new Vector:

23000
45600
87000  

I'm not sure how else to describe what I am wanting to do (this is an over-simplified example of the data) and I don't know the proper way to approach this.   I cannot do a distinct on my SQL statement that gives me the data in the first place because I have some other filtering that I must do in the Java code.

Please tell me how I can do a distinct by incrementing through a Java vector.  This is probably obvious using some type of sort method but I'm brain-fried today and can't figure it out.  :-)

nicola_mazbar

If you're just after the numbers in the example, you could simply add them to an implementation of a Set, and then create a Vector from that set using the Vector(Collection c) constructor.

Sets do not contain duplicates, so there would be no need to sort the data in order to eliminate duplicates.
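
A minimal sketch of that idea, assuming each element of the Vector is one String row with the number as its first whitespace-separated column. The class and method names (DistinctViaSet, distinctOrderNumbers) are made up for illustration and are not from the original post. A plain HashSet does not promise any particular order in the result.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.Vector;

// Hypothetical helper, for illustration only.
class DistinctViaSet {
    // Collects the first whitespace-separated column of every row into a Set
    // (which silently ignores duplicates), then copies the Set into a new Vector.
    static Vector<String> distinctOrderNumbers(Vector<String> rows) {
        Set<String> ids = new HashSet<>();
        for (String row : rows) {
            ids.add(row.trim().split("\\s+")[0]);
        }
        // Vector(Collection) copies the set's elements; HashSet order is unspecified.
        return new Vector<>(ids);
    }
}
```

Calling DistinctViaSet.distinctOrderNumbers(rows) on the five sample rows should give a Vector holding 23000, 45600 and 87000 in some order.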
CEHJ
You need to use a LinkedHashSet if you want them in the same order:

Set unique = new LinkedHashSet();
String s = (String) v.get(index);
// keep only the first column (everything before the first space)
unique.add(s.substring(0, s.indexOf(' ')));
Following on what nicola_mazbar posted above, your code would look something like:

Set ids = new HashSet();
Vector v = new Vector();
while (rs.next())
{
   String id = rs.getString(1);   // first column holds the id
   if (!ids.contains(id))
   {
       ids.add(id);
       // add row to Vector
   }
}

// vector now only contains unique ids
Do you want all columns in the resulting Vector or just the first btw?
OrdinaryGeek

ASKER

I'm only going to display the numbers, and the way I have the data there is not the same order/format of the data I am using.  Due to security reasons I cannot post the actual data I am using, so I had to make up something.  

Hashsets look overly complicated for what I am trying to accomplish.  Isn't there a simpler way of eliminating rows with duplicate order numbers (that's what I will call the first value in the row for lack of a better term)?   I don't particularly care about efficient code, just that it is easy for me (and whoever comes after me to edit this code) to understand.

Also, that order number is in the form of a String, not an Integer.


> Isn't there a simpler way of eliminating rows with duplicate order numbers

The HashSet allows you to easily not insert them in the first place.
Once you've finished reading the data you don't need the set anymore, and your Vector will have no dupes.
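
To show how little code the Set really needs, here is a rough sketch that relies on Set.add() returning false when the value is already present. It assumes the rows arrive as Strings with the order number as the first whitespace-separated column; the Iterable parameter just stands in for whatever you are reading from, and the names FirstRowPerId and keepFirstRowPerId are invented for this example.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.Vector;

// Hypothetical helper, for illustration only.
class FirstRowPerId {
    // Keeps the first row seen for each order number and skips later duplicates.
    static Vector<String> keepFirstRowPerId(Iterable<String> source) {
        Set<String> seen = new HashSet<>();
        Vector<String> kept = new Vector<>();
        for (String row : source) {
            String id = row.trim().split("\\s+")[0];
            // add() returns true only the first time this id is inserted
            if (seen.add(id)) {
                kept.add(row);
            }
        }
        return kept;
    }
}
```

The set is only a helper while looping; the returned Vector is the thing you keep.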
Yes, you don't have to use a Set, and if you only want the first values:


Vector unique = new Vector();

....
while (rs.next()) {
    String s = rs.getString(1);
    if (!unique.contains(s)) {
      unique.add(s);
    }      
}
This may sound crazy, but what does "rs" represent in the code snippets?  I don't see an object declared anywhere.  Is that the original vector?


It's your ResultSet, assuming you're reading the values from the database.
You can replace the loop with whatever you are reading the data from.
Actually, I already have the data in a Vector.  
Reading from a Vector and the end result needs to be a Vector is what I meant to say.
Then you could do something more like this to remove dupes from the Vector:

Set ids = new HashSet();
Iterator i = vector.iterator();
while (i.hasNext()) {
   String next = (String) i.next();
   if (ids.contains(next)) {
      // id already in Vector, so remove it via the iterator
      i.remove();
   } else {
      // store id in set
      ids.add(next);
   }
}

Yeah, that makes more sense. I'll try that snippet, Objects.
>> I already have the data in a Vector.
>> Reading from a Vector and the end result needs to be a Vector is what I meant to say.

But that initial vector was populated from a result-set, right? It might be better to populate a Set in the first place and then make a Vector out of it; that is what the experts intended to say.
>>Actually, I already have the data in a Vector.  

The data represent rows, so unless you have concatenated them, you should have a Vector of Vectors - right? In which case:


Vector unique = new Vector();
Iterator rows = vectorOfVectors.iterator();
....
while (rows.hasNext()) {
      Vector row = (Vector)rows.next();
      String s = (String)row.get(0);
      if (!unique.contains(s)) {
            unique.add(s);
      }    
}
What we normally do is populate Vectors using QuerySpecs.  And no, it is not a Vector of Vectors.  If I pull the data directly out of the Vector (and cast it to a particular type), it looks like it does in my original post.  

I'm relatively new with Java and only make use of our existing factories to write business logic - I don't have anything to do with the architecture or design that we use, that is handled by an entirely different group of people.  I can only use what they have made available to me in the factories to write my code.  The way we get data from the databases is handled by the factories and I just use a QuerySpec to get the data from a particular factory - hope that makes sense, it doesn't always make sense to me.  lol

ASKER CERTIFIED SOLUTION
Mayank S

This solution is only available to members.

Mayank, well, I originally tried that method.  The only problem is that I just want to make sure that another row with the same order number does not end up in the new Vector.  When you check the 2nd Vector with contains(), it is going to see all of the rows in the 1st Vector as unique and proceed to add them all to the 2nd Vector.
>> I just want to make sure that another row with the same order number does not end up in the new Vector.

You could define your equals() method that way.
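
A rough sketch of what that could look like, assuming the rows can be wrapped in a small class of your own. OrderRow, its fields and the distinct() helper are hypothetical names, not anything from your factories. Because equals() compares only the order number, Vector.contains() treats two rows with the same order number as the same element.

```java
import java.util.Vector;

// Hypothetical row type, for illustration only.
final class OrderRow {
    final String orderNumber;
    final String details;

    OrderRow(String orderNumber, String details) {
        this.orderNumber = orderNumber;
        this.details = details;
    }

    // Two rows count as equal when their order numbers match.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof OrderRow)) return false;
        return orderNumber.equals(((OrderRow) o).orderNumber);
    }

    // Kept consistent with equals() so the class also behaves in hash-based sets.
    @Override
    public int hashCode() {
        return orderNumber.hashCode();
    }

    // Copies rows into a new Vector, skipping any row whose order number is already there.
    static Vector<OrderRow> distinct(Vector<OrderRow> rows) {
        Vector<OrderRow> unique = new Vector<>();
        for (OrderRow row : rows) {
            if (!unique.contains(row)) {   // contains() uses equals() above
                unique.add(row);
            }
        }
        return unique;
    }
}
```

The first row seen for each order number is the one that survives.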
If the whole row is a String, then first split() it to get the id:

Set ids = new HashSet();
Iterator i = vector.iterator();
while (i.hasNext()) {
   String row = (String) i.next();
   // the id is the first whitespace-separated column
   String next = row.trim().split("\\s+")[0];
   if (ids.contains(next)) {
      // id already seen, so remove this row from the Vector
      i.remove();
   } else {
      // store id in set
      ids.add(next);
   }
}
Glad to help - any reason for the B grade? You can ask for more clarification if you want.
I had to change the code to get it to remove duplicates.   I can't really go into more detail without sharing my specific code, which I cannot do because of regulations where I work.  Thanks.  
code i posted already removed the dupes :)