Solved

Finding the duplicates in a big collection

Posted on 2007-04-03
Last Modified: 2013-11-07
Hello all,

I have a small problem.
I must find duplicates in a CollectionBase object.
Three properties together determine the uniqueness of a record.

I am reading an XML file that gives me the collection of objects in the CollectionBase object.
Then I must report/display which records are duplicated according to some tag node values.

 "public class XmlEmployesCollection : CollectionBase"

The problem is that sometimes there are more than 15,000 objects in the XmlEmployesCollection.
What I need is some guidelines for completing this task: finding the duplicates in a big collection.

I am using .NET Framework v2.0 with C#.

Thanks in advance,
So.



Question by:barbulea
3 Comments
 
LVL 33

Expert Comment

by:raterus
ID: 18845606
Whenever I need to find duplicates, I pull out the trusty Hashtable object and start adding values to it based on a "should be unique" key.  Before you add a value to the Hashtable, call its ContainsKey method to see whether that key is already there.  If it is, you know you have a duplicate.
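A minimal sketch of that idea, under some assumptions: the Employee type, its Name/Age/Salary members and the MakeKey helper are hypothetical stand-ins for the real record and its three key properties, and the separator character is assumed never to occur inside Name. It only uses types available in .NET 2.0.

```csharp
using System;
using System.Collections;
using System.Globalization;

// Hypothetical record type; the three key properties are assumptions.
public class Employee
{
    public string Name;
    public int Age;
    public double Salary;

    public Employee(string name, int age, double salary)
    {
        Name = name;
        Age = age;
        Salary = salary;
    }
}

public static class HashtableDuplicateFinder
{
    // Builds one "should be unique" key from the three values.
    // The unit-separator character is assumed not to appear in Name.
    public static string MakeKey(Employee e)
    {
        return e.Name + "\u001F"
            + e.Age.ToString(CultureInfo.InvariantCulture) + "\u001F"
            + e.Salary.ToString("R", CultureInfo.InvariantCulture);
    }

    // Returns every item whose key was already seen earlier in the collection.
    public static ArrayList FindDuplicates(ICollection employees)
    {
        Hashtable seen = new Hashtable();
        ArrayList duplicates = new ArrayList();
        foreach (Employee e in employees)
        {
            string key = MakeKey(e);
            if (seen.ContainsKey(key))
                duplicates.Add(e);   // key already present: duplicate
            else
                seen.Add(key, e);    // first occurrence of this key
        }
        return duplicates;
    }
}
```

Since CollectionBase implements ICollection, the XmlEmployesCollection could be passed to FindDuplicates directly, provided it only contains objects of the assumed type. Each item costs one hash lookup, so a single pass should be enough for 15,000 objects.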
 
LVL 16

Expert Comment

by:AlexNek
ID: 18845796
With a key made of 3 properties it is not quite as simple, but it only needs a few additional steps.
For a single key (which can also be a composite key) you have at least 2 methods:
- Sort the collection by key and remove one of each pair of equal neighbouring items
- When you build the collection, keep an additional map by key and don't add items whose key is already in the map
A sorted insert using binary search that rejects items already present would also work.
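The sort-based method could be sketched roughly as follows. The Employee type and its Name/Age/Salary members are hypothetical; the real key properties would go into CompareByKey. This version reports the duplicates rather than removing them, since the question asks to display them.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical record type; the three key properties are assumptions.
public class Employee
{
    public string Name;
    public int Age;
    public double Salary;

    public Employee(string name, int age, double salary)
    {
        Name = name;
        Age = age;
        Salary = salary;
    }
}

public static class SortedDuplicateFinder
{
    // Orders by Name, then Age, then Salary; returns 0 only when all three match.
    public static int CompareByKey(Employee a, Employee b)
    {
        int c = string.CompareOrdinal(a.Name, b.Name);
        if (c != 0) return c;
        c = a.Age.CompareTo(b.Age);
        if (c != 0) return c;
        return a.Salary.CompareTo(b.Salary);
    }

    // Sorts a copy of the input, then reports every item equal to its predecessor.
    public static List<Employee> FindDuplicates(IEnumerable employees)
    {
        List<Employee> sorted = new List<Employee>();
        foreach (Employee e in employees)
            sorted.Add(e);
        sorted.Sort(CompareByKey);

        List<Employee> duplicates = new List<Employee>();
        for (int i = 1; i < sorted.Count; i++)
        {
            // After sorting, items with equal keys sit next to each other.
            if (CompareByKey(sorted[i - 1], sorted[i]) == 0)
                duplicates.Add(sorted[i]);
        }
        return duplicates;
    }
}
```

Sorting costs O(n log n) comparisons and works on a copy, so the original collection order is left untouched.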
 
LVL 6

Accepted Solution

by:
thuannguy earned 500 total points
ID: 18848656
You can use three Dictionary<TKey, TValue> instances to store the objects. Let's consider a concrete example in which the three "KEYS" are Age, Salary and Name:
      // One dictionary per key property; each maps a value to the first object seen with it
      Dictionary<string, Employee> nameDict = new Dictionary<string, Employee>();
      Dictionary<int, Employee> ageDict = new Dictionary<int, Employee>();
      Dictionary<double, Employee> salaryDict = new Dictionary<double, Employee>();
      List<Employee> duplicateList = new List<Employee>();

      public void Add(Employee employee)
      {
         // Assume a duplicate until one of the three values turns out to be new
         bool isDuplicate = true;
         if (!nameDict.ContainsKey(employee.Name))
         {
            isDuplicate = false;
            nameDict.Add(employee.Name, employee);
         }

         if (!ageDict.ContainsKey(employee.Age))
         {
            isDuplicate = false;
            ageDict.Add(employee.Age, employee);
         }

         if (!salaryDict.ContainsKey(employee.Salary))
         {
            isDuplicate = false;
            salaryDict.Add(employee.Salary, employee);
         }

         if (isDuplicate)
            duplicateList.Add(employee); // all three values were seen before: add it to the duplicate list
      }

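When you read an object from the XML file, just use the Add method to add it to the container. In the Add method, we check whether the three "KEYS" already exist. Since we only store references to the objects in the three dictionaries, the memory cost is modest. Note that this flags an object whenever each of its three values has been seen before, even if they were seen on different records; if uniqueness is defined by the combination of the three values, a single dictionary keyed on all three is the safer choice.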
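If the three properties only identify a record in combination, one possible variation is a single dictionary keyed on all three values at once. The sketch below is an assumption-laden illustration, not the code from the answer above: Employee, EmployeeKey and DuplicateGrouper are hypothetical names, and it groups records by key so that every set of duplicated records can be displayed.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical record type; the three key properties are assumptions.
public class Employee
{
    public string Name;
    public int Age;
    public double Salary;

    public Employee(string name, int age, double salary)
    {
        Name = name;
        Age = age;
        Salary = salary;
    }
}

// Hypothetical composite key: equal only when all three values are equal.
public struct EmployeeKey : IEquatable<EmployeeKey>
{
    private readonly string name;
    private readonly int age;
    private readonly double salary;

    public EmployeeKey(Employee e)
    {
        name = e.Name;
        age = e.Age;
        salary = e.Salary;
    }

    public bool Equals(EmployeeKey other)
    {
        return name == other.name && age == other.age && salary.Equals(other.salary);
    }

    public override bool Equals(object obj)
    {
        return obj is EmployeeKey && Equals((EmployeeKey)obj);
    }

    public override int GetHashCode()
    {
        int h = (name == null) ? 0 : name.GetHashCode();
        h = unchecked(h * 31 + age);
        h = unchecked(h * 31 + salary.GetHashCode());
        return h;
    }
}

public static class DuplicateGrouper
{
    // Returns one list per key that occurs more than once.
    public static List<List<Employee>> FindDuplicateGroups(IEnumerable employees)
    {
        Dictionary<EmployeeKey, List<Employee>> buckets =
            new Dictionary<EmployeeKey, List<Employee>>();
        foreach (Employee e in employees)
        {
            EmployeeKey key = new EmployeeKey(e);
            List<Employee> bucket;
            if (!buckets.TryGetValue(key, out bucket))
            {
                bucket = new List<Employee>();
                buckets.Add(key, bucket);
            }
            bucket.Add(e);
        }

        List<List<Employee>> result = new List<List<Employee>>();
        foreach (List<Employee> candidates in buckets.Values)
        {
            if (candidates.Count > 1)
                result.Add(candidates);
        }
        return result;
    }
}
```

With this key, the records (Ann, 30, 1000), (Bob, 40, 2000) and (Ann, 40, 1000) are all treated as distinct, because no two of them agree on all three values.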