Solved

Comparing 2 CSV (text) files with fuzzy hashing: possible?

Posted on 2009-06-28
Last Modified: 2012-05-07
Hi guys,
My boss asked me to compare two files, which are actually user repository files.

One is the HR user repository; the other is the AD user repository.

I have two CSVs which include the following columns:
FirstName , FirstName2 , LastName , LastName2 , EmployeeID

Objective: I need to find which users in the AD file do not have a matching user in the HR file (which means they have an account but are not employees).

Problem one: I tried using Contains, comparing first name to first name and last name to last name. However, sometimes there is more than one first or last name, so I need to do more checks.
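
A minimal sketch of one way to cope with the extra names, written in Python rather than Perl and using only built-ins. Instead of a substring check, it treats each person's first names and last names as sets and asks whether the two records share at least one of each. The function names and the overlap rule are assumptions made for illustration; the dictionary keys simply mirror the column list above.

```python
def name_tokens(*fields):
    """Collect non-empty name fields into a set of lowercase tokens."""
    tokens = set()
    for field in fields:
        # A single field may hold several names separated by spaces.
        tokens.update(part.lower() for part in (field or "").split())
    return tokens


def names_overlap(a, b):
    """True if two records share at least one first name and one last name."""
    first_a = name_tokens(a.get("FirstName"), a.get("FirstName2"))
    first_b = name_tokens(b.get("FirstName"), b.get("FirstName2"))
    last_a = name_tokens(a.get("LastName"), a.get("LastName2"))
    last_b = name_tokens(b.get("LastName"), b.get("LastName2"))
    return bool(first_a & first_b) and bool(last_a & last_b)
```

With this rule, an AD record for "John Smith" would match an HR record listing "Arik, John" as first names and "Smith" as last name. It is deliberately loose, so it is probably best used to shortlist candidates for a manual check rather than as a final verdict.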

The first check, however, is the employee ID. It's not always there, but when it is, it's 100% correct.
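
The "ID first, names second" order could be sketched roughly as follows, in Python with the standard `csv` module. It assumes both files have no header row and use the five columns in the order listed above; `read_users`, `name_key`, and `ad_users_missing_from_hr` are hypothetical names, and the name fallback here is an exact, case-insensitive comparison only.

```python
import csv

FIELDS = ["FirstName", "FirstName2", "LastName", "LastName2", "EmployeeID"]


def read_users(lines):
    """Parse headerless CSV lines into dicts, trimming spaces around each cell."""
    users = []
    for row in csv.reader(lines):
        if not any(cell.strip() for cell in row):
            continue  # skip blank lines
        cells = [cell.strip() for cell in row]
        cells += [""] * (len(FIELDS) - len(cells))  # pad short rows
        users.append(dict(zip(FIELDS, cells)))
    return users


def name_key(user):
    """Fallback key: primary first and last name, case-insensitive."""
    return (user["FirstName"].lower(), user["LastName"].lower())


def ad_users_missing_from_hr(ad_users, hr_users):
    """Return AD users with no HR match, checking EmployeeID before names."""
    hr_ids = {u["EmployeeID"] for u in hr_users if u["EmployeeID"]}
    hr_names = {name_key(u) for u in hr_users}
    missing = []
    for user in ad_users:
        if user["EmployeeID"] and user["EmployeeID"] in hr_ids:
            continue  # the ID is trusted, so this is a definite match
        if name_key(user) in hr_names:
            continue  # no ID match: fall back to an exact name match
        missing.append(user)
    return missing
```

`read_users` accepts any iterable of lines, so it can be called as `read_users(open("ad.csv", newline=""))`, where the file name is a placeholder. Whatever ends up in the returned list is a candidate "account without an employee" and still needs a check for typos.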

The second problem is typos. One file has

Arik , John , Smith
and the other has
Aric , John , Smith

Is there a way to find a percentage of matching, using some kind of fuzzy hashing?
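
What is being asked for here is closer to approximate string matching than to hashing. As a hedged sketch in Python, the standard library's `difflib.SequenceMatcher` gives a similarity ratio between 0 and 1 that can be read as a percentage. The helper names and the 0.7 threshold are assumptions picked for illustration, not values from this thread.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0..1 similarity score, ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def best_match(name, candidates, threshold=0.7):
    """Return (candidate, score) for the closest candidate, or None if all score below threshold."""
    scored = [(c, similarity(name, c)) for c in candidates]
    scored = [pair for pair in scored if pair[1] >= threshold]
    return max(scored, key=lambda pair: pair[1]) if scored else None


print(similarity("Arik", "Aric"))  # 0.75, i.e. a 75% match
```

The threshold is the part that needs tuning on real data: too low and unrelated short names start matching, too high and genuine typos are missed.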

I'm open to all ideas; maybe I'm not seeing the full picture.

Help is much appreciated ;)
Question by:m0tek
LVL 84

Expert Comment

by:ozo
ID: 24733100
perldoc -q "How can I do approximate matching"
       See the module String::Approx available from CPAN.
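
For readers who want to see the idea behind approximate matching without installing a module: as far as its documentation describes, String::Approx measures closeness with Levenshtein edit distance. The block below is a plain Python sketch of that distance, not String::Approx's own code or API, and `within_edits` is a hypothetical helper name.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-character inserts, deletes, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep when equal
            ))
        previous = current
    return previous[-1]


def within_edits(a, b, max_edits=1):
    """True if the strings differ by at most max_edits edits, ignoring case."""
    return edit_distance(a.lower(), b.lower()) <= max_edits


print(edit_distance("Arik", "Aric"))  # 1
```

Allowing one edit is enough to catch a single mistyped letter in a name; longer names may justify a larger `max_edits`.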
 
LVL 40

Expert Comment

by:mrjoltcola
ID: 24733210
Check out Tie::Hash::Abbrev, it might be useful.

http://search.cpan.org/~fany/Tie-Hash-Array-0.1/lib/Tie/Hash/Abbrev.pm
 
LVL 40

Expert Comment

by:mrjoltcola
ID: 24733214
Another article with some sample code

http://www.perlmonks.org/?node_id=300129
LVL 84

Accepted Solution

by:ozo (earned 500 total points)
ID: 24733238
 

Author Comment

by:m0tek
ID: 24734174
Is there anything ready-made that uses one of those?

This one seems very similar to what I need:

http://www.perlmonks.org/?node_id=300129

(I don't know Perl or anything, yet :( )
 
LVL 40

Expert Comment

by:mrjoltcola
ID: 24742105
I don't think you will have much luck if you don't know Perl. Did you ask this in the Perl zone by mistake?

Question has a verified solution.
