conorocallaghan

Best-performing way to prevent duplicates in a collection of objects?

I've created an application that scans through a log file line by line and creates an object representing each line, i.e. the object contains the date, strings, line number, etc.

I wish to prevent duplicates of a certain type, i.e. duplicate objects would have the same date but different strings and line numbers.

I've been using an ArrayList of objects, and each time before I add an object I check whether it is already in the ArrayList. I've overridden the equals() method and I'm using arrayList.contains(object) to check for the duplicate.

My problem is that the ArrayList can have up to 60,000 elements, and I have to check for duplicates each time before I add an element.

Does anyone know of a more efficient way of preventing duplicates? This is really inefficient.
CEHJ

You can use a Set. A Set cannot contain duplicates.
aozarov

Yes, as mentioned above, a HashSet would be your best bet.
import java.util.HashSet;
import java.util.Set;

Set<Object> set = new HashSet<Object>();

if (set.contains(key)) {
    // skip this
} else {
    set.add(key);
    // continue logic...
}
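One thing worth spelling out: a HashSet only treats two objects as duplicates if hashCode() agrees with equals(), so overriding equals() alone (as was enough for ArrayList.contains) will not work. Below is a minimal sketch, assuming "duplicate" means "same date" as described in the question. The LogEntry class, its field names and the firstPerDate helper are hypothetical stand-ins, not the asker's actual code.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the object created per log line.
final class LogEntry {
    final String date;
    final String text;
    final int lineNumber;

    LogEntry(String date, String text, int lineNumber) {
        this.date = date;
        this.text = text;
        this.lineNumber = lineNumber;
    }

    // Assumption: two entries are duplicates when their dates match.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof LogEntry)) return false;
        return date.equals(((LogEntry) other).date);
    }

    // Must be consistent with equals(): equal entries give equal hash codes.
    @Override
    public int hashCode() {
        return date.hashCode();
    }
}

final class LogDedupSketch {
    // Keeps the first entry seen for each date and drops later ones.
    static Set<LogEntry> firstPerDate(List<LogEntry> entries) {
        Set<LogEntry> seen = new HashSet<LogEntry>();
        for (LogEntry entry : entries) {
            // add() returns false and leaves the set unchanged on a duplicate,
            // so no separate contains() call is needed.
            seen.add(entry);
        }
        return seen;
    }
}
```

With this in place each check is a hash lookup rather than a scan over up to 60,000 elements. Note that a HashSet does not preserve insertion order; if the order of the log lines matters, LinkedHashSet is a drop-in alternative.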
ASKER CERTIFIED SOLUTION
CEHJ
This solution is only available to members.
I assume that the set is not pre-populated but gets populated as it goes through the log file.
And in that case an action might be taken when a line already exists (and not necessarily overriding the old line with the new one).
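To illustrate that last point, here is a rough sketch of taking an action on a duplicate while leaving the first occurrence untouched. It assumes the date string is the duplicate key and that the "action" is just collecting a message; the class name, the scan method and the message format are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class DuplicateReportSketch {
    // firstSeen maps each date to the line number where it first appeared.
    // Returns one message per duplicate line; earlier entries are never replaced.
    static List<String> scan(List<String> dates, Map<String, Integer> firstSeen) {
        List<String> report = new ArrayList<String>();
        for (int i = 0; i < dates.size(); i++) {
            int lineNumber = i + 1;
            // putIfAbsent (Java 8+) stores the value only if the key is new,
            // and returns the existing value otherwise.
            Integer earlier = firstSeen.putIfAbsent(dates.get(i), lineNumber);
            if (earlier != null) {
                report.add("line " + lineNumber + " duplicates line " + earlier);
            }
        }
        return report;
    }
}
```

Passing in an empty HashMap matches the case where nothing is pre-populated; the map fills up as the file is read.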
:-) but why a 'C'?