RobertNZana
asked on
How parse out first/last name?
Using an access 2007 database. There is a field 'contact' that has hundreds of names in it. There are many formats:
Mr. FirstName LastName
Mrs. FirstName LastName
FirstName LastName
FirstName LastName-LastName2
FirstName LastName LastName2 Jr/Sr
I want to automatically get them into 2 columns: firstname, lastname
First, I am going to strip out ALL Mr., Mrs., Ms. to make it simple. So I know the first token is FirstName and the REST is LastName.
How can I accomplish this? Please be detailed.
Mr. FirstName LastName
Mrs. FirstName LastName
FirstName LastName
FirstName LastName-LastName2
FirstName LastName LastName2 Jr/Sr
I want to automatically get them into 2 columns: firstname, lastname
First, I am going to strip out ALL Mr., Mrs., Ms. to make it simple. So I know the first token is FirstName and the REST is LastName.
How can I accomplish this? Please be detailed.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
RobertNZana said:
>>First, I am going to strip out ALL Mr., Mrs., Ms. to make it simple. So I know the first token is FirstName and
>>the REST is LastName.
With respect, you know nothing of the sort. You yourself indicated that some of the names have suffixes such
as Jr/Sr (and one can imagine III, IV, etc., and/or Ph.D., Esq., M.D, etc.).
There is also the possibility of first and/or last names composed of multiple words. For example:
Oscar de la Renta
We "know" just by looking at that name that Oscar is the first name and de la Renta is the surname. However,
suppose we had:
John Paul Van Stone
How could we write a rule that acurately determines where the first name ends and the surname starts?
>>First, I am going to strip out ALL Mr., Mrs., Ms. to make it simple. So I know the first token is FirstName and
>>the REST is LastName.
With respect, you know nothing of the sort. You yourself indicated that some of the names have suffixes such
as Jr/Sr (and one can imagine III, IV, etc., and/or Ph.D., Esq., M.D, etc.).
There is also the possibility of first and/or last names composed of multiple words. For example:
Oscar de la Renta
We "know" just by looking at that name that Oscar is the first name and de la Renta is the surname. However,
suppose we had:
John Paul Van Stone
How could we write a rule that acurately determines where the first name ends and the surname starts?
Dear RobertNZana,
is this a task to be done once or will it be done on a regular base?
How many names do you have in fact?
Say you have about 500 names and you have to transform it once I would suggest you to do this in excel and paste the result into your access.
Because: What about Names like John F. Kenedy or Lydia Parker Bowls? You are not able to let this be done by a simple query.
Best Regards
Andreas
is this a task to be done once or will it be done on a regular base?
How many names do you have in fact?
Say you have about 500 names and you have to transform it once I would suggest you to do this in excel and paste the result into your access.
Because: What about Names like John F. Kenedy or Lydia Parker Bowls? You are not able to let this be done by a simple query.
Best Regards
Andreas
This forum is replete with questions similar to this one, and of course, just a wide a variety of answers to try and cover all the possibilities as Patrick suggests. In Search, by entering - parse first name and last name - I got over 2300 hits.
Maybe you will find one in the list from the Search that you like ;-)
as the other experts have noted, other variations of names will not be easy to parse.
to handle the FirstName LastName (single Name and single LastName format) and ignore the rest (which you have to deal with separately, run this query)
Update tableX
Set FName=Trim(Left([Contact], Instr([Con tact], " ")-1)),
LName=Trim(Mid([Contact],I nstr([Cont act], " ")+1))
Where Instr([Contact], " ")=Instrrev([Contact], " ")
to handle the FirstName LastName (single Name and single LastName format) and ignore the rest (which you have to deal with separately, run this query)
Update tableX
Set FName=Trim(Left([Contact],
LName=Trim(Mid([Contact],I
Where Instr([Contact], " ")=Instrrev([Contact], " ")
I should have verified this - but I was working on the assumption that the asker is planning on creating a mail-merge - in which case, he would need the first name to be separate of the rest of the name. So names like "Oscar de la Renta" are not an issue - First Name gets Oscar, last name gets everything else.
If that's not the purpose, then for only 500 or so names, I would export the data into an excel file and break it up based on spaces - then do some manual tweaking but ensure the fields for Salutation, FirstName, MiddleName(s), LastName, and Suffix were al; correct (I've done this with THOUSANDS of names). By doing this in excel, those that don't conform are fairly easily found and adjusted.
In the end though, you need someone reviewing EVERYTHING when you don't have a properly designed database where the initial layout did not specify those fields in the beginning.
If that's not the purpose, then for only 500 or so names, I would export the data into an excel file and break it up based on spaces - then do some manual tweaking but ensure the fields for Salutation, FirstName, MiddleName(s), LastName, and Suffix were al; correct (I've done this with THOUSANDS of names). By doing this in excel, those that don't conform are fairly easily found and adjusted.
In the end though, you need someone reviewing EVERYTHING when you don't have a properly designed database where the initial layout did not specify those fields in the beginning.
ASKER
Thanks for your feedback. This is a one time import of data. I will mull over your comments and award points soon. Thanks.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
RobertNZana,
Here are some results using sample data I contrived to hit various issues you will have to face.
Note the result for "Missy Andrea Wiggin". The code assumes a multi-word last name here, but it could just
as easily be a multi-word first name...
Patrick
Here are some results using sample data I contrived to hit various issues you will have to face.
Note the result for "Missy Andrea Wiggin". The code assumes a multi-word last name here, but it could just
as easily be a multi-word first name...
Patrick
NameColumn Title FirstName Surname Suffix
----------------------------------------------------------------------------
Dr. Andrew Wiggin Dr Andrew Wiggin
Dr Andrew Wiggin Dr Andrew Wiggin
Andrew Wiggin, M.D. Andrew Wiggin M.D.
Andrew Wiggin, MD Andrew Wiggin MD
Andrew Wiggin MD Andrew Wiggin MD
Mr. Andrew Wiggin Mr Andrew Wiggin
Mr Andrew Wiggin Mr Andrew Wiggin
Missy Andrea Wiggin Missy Andrea Wiggin
Ms. Andrea Wiggin Ms Andrea Wiggin
Ms Andrea Wiggin Ms Andrea Wiggin
Mrs. Andrea Wiggin Mrs Andrea Wiggin
Mrs Andrea Wiggin Mrs Andrea Wiggin
Andrea Wiggin, PhD Andrea Wiggin PhD
Andrea Wiggin, Ph.D. Andrea Wiggin Ph.D.
Andrea Wiggin, Esq Andrea Wiggin Esq
Andrea Wiggin, Esq. Andrea Wiggin Esq.
Andrea Wiggin, Esquire Andrea Wiggin Esquire
Andrea Wiggin Ph D Andrea Wiggin Ph D
Andrea Wiggin Ph.D. Andrea Wiggin Ph.D.
Andrea Wiggin Esq Andrea Wiggin Esq
Andrea Wiggin Esq. Andrea Wiggin Esq.
Andrea Wiggin Esquire Andrea Wiggin Esquire
Mrs. Andrea Wiggin III Mrs Andrea Wiggin III
Mrs. Andrea Wiggin iv Mrs Andrea Wiggin iv
Mrs. Andrea Wiggin Jr. Mrs Andrea Wiggin Jr.
Mrs. Andrea Wiggin Sr Mrs Andrea Wiggin Sr
Mrs. Andrea Wiggin Sr. Mrs Andrea Wiggin Sr.
Rev. Andrew Wiggin-Smith Rev Andrew Wiggin-Smith
Rev Andrew Wiggin Rev Andrew Wiggin
Jean-Claude Van Dam Jean-Claude Van Dam
Mr Oscar de la Renta Mr Oscar de la Renta
ASKER
Got it! Thanks
I would run a series of SQL statements with replace:
Update TableName Set Contact = Replace(Contact, 'Mr. ', '')
Update TableName Set Contact = Replace(Contact, 'Mrs. ', '')
Update TableName Set Contact = Replace(Contact, 'Ms. ', '')
Update TableName Set Contact = Replace(Contact, 'Dr. ', '')
Update TableName Set Contact = Replace(Contact, 'Mr ', '')
Update TableName Set Contact = Replace(Contact, 'Mrs ', '')
Update TableName Set Contact = Replace(Contact, 'Ms ', '')
Update TableName Set Contact = Replace(Contact, 'Dr ', '')
etc
(It's important that you include the space following the salutation - skipping it will result in Mr. John Drake looking like Mr. John ake)