Dedup a List<Hashtable> in C#

Posted on 2009-04-24
Last Modified: 2013-12-20
I have a requirement where I need to eliminate the duplicates from a List<Hashtable>.

Each Hashtable is a row of data. For example, if there are 500 records with around 10 fields, the 10 field names are the keys of the Hashtable and the row's data is held in the values.
Each row (Hashtable) is added to the List, so at the end the List will have 500 Hashtable records.

I need to remove duplicates from this List<> based on certain criteria.

How can I make this possible? Please help me; I need a solution in 2 days, as I have to finish it by Monday evening.

Thanks in advance.

Could anyone suggest some sample code?

Additional info


List<Hashtable> executeDedup(List<Hashtable> sourceData,  List<string> uniqueFieldList, DedupRuleTypes dedupRuleType)	


sourceData holds a list of Hashtables (colname -> colvalue), one for each row of actual data

uniqueFieldList is a list of field/column names to identify a row as unique in dedup

DedupRuleTypes is enum with options - Keep First Record, Keep the most complete record


Question by:nithinmohantk
    LVL 26

    Expert Comment

    by:Anurag Thakur
    To be very frank, I don't like the design of using a generic list of Hashtables. First, it takes away the advantages of a typed list, because boxing and unboxing still remain.

    Second, your design can be improved a little more.
    Please explain your requirements first, as most of the features provided by the Hashtable can be achieved by just using a generic list.
    LVL 1

    Author Comment

    My requirement is this: the user uploads a set of addresses or an Excel sheet.

    From the records in this Excel file we need to remove duplicate records.
    We will read the Excel sheet, build a Hashtable for each row of data, and append it to a List<Hashtable>.
    It's just a parameter I need to pass; inside the method I can import it into a generic list of my choice and do the operations.
    But my PM is specific about the input parameter type.

    For duplicate removal we pass an extra parameter, List<string> uniqueFieldList, which holds the column names on which we need to find unique data. This is the user's choice: the user specifies which columns must be unique.

    I hope this information helps.
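For the "keep first record" case, the requirement above can be sketched without leaving List<Hashtable>. This is only a minimal sketch: BuildKey and KeepFirst are hypothetical names, and it assumes two rows are duplicates when the values of the columns in uniqueFieldList match after trimming and ignoring case, which the thread does not actually specify.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class DedupSketch
{
    // Hypothetical helper: joins the values of the chosen columns into one key.
    // Assumptions: values are compared trimmed and case-insensitively, and the
    // unit-separator character '\u001F' never occurs in the data.
    public static string BuildKey(Hashtable row, List<string> uniqueFieldList)
    {
        var parts = new List<string>();
        foreach (string field in uniqueFieldList)
        {
            object value = row[field]; // null when the column is missing
            parts.Add(value == null ? "" : value.ToString().Trim().ToUpperInvariant());
        }
        return string.Join("\u001F", parts.ToArray());
    }

    // Keeps the first row seen for each key and drops later rows with the same key.
    public static List<Hashtable> KeepFirst(List<Hashtable> sourceData, List<string> uniqueFieldList)
    {
        var seen = new HashSet<string>();
        var result = new List<Hashtable>();
        foreach (Hashtable row in sourceData)
        {
            // HashSet<T>.Add returns false when the key is already present.
            if (seen.Add(BuildKey(row, uniqueFieldList)))
                result.Add(row);
        }
        return result;
    }
}
```

The original row order is preserved, and the input list is not modified.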
    LVL 26

    Accepted Solution

    Now I understand your problem better.

    My suggestion for handling the problem is as follows:
    load the data you have into a DataTable (either from Excel or from the address objects),
    then write a function for finding distinct values (see the following link) and get the unique values.
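One possible way to do this with a DataTable is sketched below. It is not necessarily the function from the linked page; it uses the built-in DataView.ToTable(true, columns) overload instead. The class and method names are hypothetical, and all values are assumed to be storable as strings.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Data;

public static class DataTableDistinctSketch
{
    // Copies the chosen columns of each Hashtable row into a DataTable.
    // Assumption: every value is stored as a string; missing values become "".
    public static DataTable ToDataTable(List<Hashtable> sourceData, List<string> columns)
    {
        var table = new DataTable();
        foreach (string column in columns)
            table.Columns.Add(column, typeof(string));

        foreach (Hashtable row in sourceData)
        {
            DataRow dataRow = table.NewRow();
            foreach (string column in columns)
                dataRow[column] = row[column] == null ? "" : row[column].ToString();
            table.Rows.Add(dataRow);
        }
        return table;
    }

    // Returns one row per distinct combination of the named columns.
    public static DataTable DistinctRows(DataTable table, List<string> uniqueFieldList)
    {
        return table.DefaultView.ToTable(true, uniqueFieldList.ToArray());
    }
}
```

Note that the table returned by ToTable contains only the named columns, so this gives the distinct key values rather than the full original rows.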
    LVL 1

    Author Comment

    Hi Ragi, I will try that. But there is one more condition I need to check.

    I have two options to handle: the user can specify to keep the first record and remove the rest, or to keep the record that has more data or the most complete data. This is what I'm a bit confused about.

    The first record I can do. But how would I know which record has the most perfect or complete data? Seems a bit funny, right? I could check by string character length, but it won't take me anywhere.

    Suppose there are 2 or 3 records with the same character length; then what will I do?

    Any ideas?
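One way to approach the "most complete record" option is sketched below, under an assumed definition: a record is more complete when it has more non-blank field values, and a tie keeps the record that appeared first. Both rules are assumptions for illustration, not something the thread settles, and the names (CountFilledFields, KeepMostComplete, BuildKey) are hypothetical.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class MostCompleteSketch
{
    // Assumed measure of completeness: the number of fields with a non-blank value.
    public static int CountFilledFields(Hashtable row)
    {
        int filled = 0;
        foreach (DictionaryEntry entry in row)
        {
            if (entry.Value != null && entry.Value.ToString().Trim().Length > 0)
                filled++;
        }
        return filled;
    }

    // Hypothetical key builder: joins the values of the chosen columns.
    // Assumes '\u001F' never occurs in the data.
    static string BuildKey(Hashtable row, List<string> uniqueFieldList)
    {
        var parts = new List<string>();
        foreach (string field in uniqueFieldList)
            parts.Add(row[field] == null ? "" : row[field].ToString());
        return string.Join("\u001F", parts.ToArray());
    }

    // For each key, keeps the row with the most filled fields.
    public static List<Hashtable> KeepMostComplete(List<Hashtable> sourceData, List<string> uniqueFieldList)
    {
        var indexByKey = new Dictionary<string, int>(); // key -> position in result
        var result = new List<Hashtable>();
        foreach (Hashtable row in sourceData)
        {
            string key = BuildKey(row, uniqueFieldList);
            int index;
            if (!indexByKey.TryGetValue(key, out index))
            {
                indexByKey[key] = result.Count;
                result.Add(row);
            }
            else if (CountFilledFields(row) > CountFilledFields(result[index]))
            {
                // Strictly greater, so on a tie the earlier row is kept.
                result[index] = row;
            }
        }
        return result;
    }
}
```

Counting filled fields avoids the problem with total character length, where one long value can outweigh several missing ones; the tie rule makes the result deterministic when 2 or 3 records score the same.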
    LVL 1

    Author Comment

    Hi Ragi, it worked for my requirement, except for the option I mentioned previously:
    selecting only the most complete record.

    Anyway, thanks again, Ragi. It was a great help, and I think I will modify it and find alternatives.

