Splitting of Suffix

vrider1
vrider1 used Ask the Experts™
on
I am looking for some PL SQL code to string off the a part of the Last Name field
the entry may look like this :
"de la Rosa Martinez Jr" or like "de la Rosa Martinez Rodriguez III"

I like the last name and the Suffix separated. Suffix values would be "Jr, Sr, II, III etc".
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Top Expert 2009

Commented:
SQL> select regexp_substr(name, '(Jr|Sr|II|III)$') as suffix from names;

SUFFIX
--------------------------------------------------------------------------------
Jr

II


So you can use it in PL/SQL:

declare
  var suffix;
begin
  suffix := regexp_substr(...);
  ...
end;
/
Top Expert 2009

Commented:
Also consider normalizing to uppercase or lowercase to catch all variations:

regexp_substr(upper(name), '(JR|SR|II|III)$')
Top Expert 2009
Commented:
Oops, add a space or it will grab portions of the name, like:  YASR (SR)

ltrim(regexp_substr(upper(name), '\s(JR|SR|II|III)$'))
Ensure you’re charging the right price for your IT

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden using our free interactive tool and use it to determine the right price for your IT services. Start calculating Now!

Naveen KumarProduction Manager / Application Support Manager

Commented:
you can even try :

select substr('de la Rosa Martinez Rodriguez III',instr('de la Rosa Martinez Rodriguez III',' ',-1)+1)
from dual;

This works without using regular expressions assuming the last word of the given string is the one which you wanted to extract.

in pl/sql, it will be as shown below :

declare
  suffix varchar2(100);
begin
  suffix :=  substr('de la Rosa Martinez Rodriguez III',instr('de la Rosa Martinez Rodriguez III',' ',-1)+1);
  dbms_output.put_line(suffix);
end;
/

Thanks,

Author

Commented:
SELECT   fdr_personnel.pers_id,
              INITCAP(TRANSLATE (
                         SUBSTR (first_name,
                                 1,
                                 INSTR (first_name || ' ', ' ') - 1),
                         'A.',
                         'A'
                      ))
                 AS first_name,
              INITCAP(DECODE (
                         SUBSTR (fdr_personnel.first_name,
                                 INSTR (fdr_personnel.first_name, ' '),
                                 LENGTH (fdr_personnel.first_name)),
                         TRANSLATE (
                            SUBSTR (first_name,
                                    1,
                                    INSTR (first_name || ' ', ' ') - 1),
                            'A.',
                            'A'
                         ),
                         '',
                         SUBSTR (fdr_personnel.first_name,
                                 INSTR (fdr_personnel.first_name, ' '),
                                 LENGTH (fdr_personnel.first_name))
                      ))
                 AS middle_name,
           initcap( fdr_personnel.last_name )
                 AS last_name,
     ltrim(regexp_substr(upper(fdr_personnel.last_name), '\s(JR|SR|II|III)$'))
                 AS suffix,
              DECODE (fdr_personnel.email_addr,
                      ' ', '',
                      CONCAT (fdr_personnel.email_addr, '@edu')),
              fdr_personnel.email_addr AS eca_id,
              fdr_personnel.bldg_code,
              fdr_personnel.room,
              fdr_personnel.mail_station,
              fdr_personnel.pers_phone,
              fdr_personnel.fax_phone,
              fdr_personnel.sex,
              fdr_personnel.race,
              cams_race_codes.cams_race_code,
              cams_race_codes.cams_race_desc,
              fdr_personnel.ADDRESS,
              fdr_personnel.ADDRESS_2,
              fdr_personnel.CITY,
              fdr_personnel.STATE,
              fdr_personnel.ZIP,
              fdr_personnel.HOME_PHONE,
              personal.Status_Code
       FROM   fdr.fdr_personnel, personal, cams_race_codes
      WHERE   (fdr_personnel.race = cams_race_codes.fdr_personnel_race(+))
              AND (fdr_personnel.pers_id = personal.pers_ID)
   ORDER BY   fdr_personnel.pers_ID;
Thank you .that code give me the Suffix.  My additional challenge is to remove the suffix from the last name value.


I originally tried this, so it does not provide clean data for all the records :

   INITCAP(TRANSLATE (
                         SUBSTR (
                            fdr_personnel.last_name,
                            1,
                            INSTR (fdr_personnel.last_name || ' ', ' ') - 1
                         ),
                         'A.',
                         'A'
                      ))
                 AS last_name
Naveen KumarProduction Manager / Application Support Manager

Commented:
assuming the last word as the suffix, can you change the below and try :

initcap( fdr_personnel.last_name ) AS last_name, --> this is in your query

change that to --> as given below

initcap(substr(fdr_personnel.last_name,1,instr(fdr_personnel.last_name,' ',-1)-1)) as last_name

Thanks,

Author

Commented:
Thank you.
Unfortunately the last word is not always the suffix. I applied the code and it works as you described , for cases where there is a suffix. So the code would have to give the lastname unaltered and only in cases where a suffix is split of the last code piece would be used. I guess it would have be be a sort of case select to add some conditional logic. I like to keep the all this functionality with the select statement.
Naveen KumarProduction Manager / Application Support Manager

Commented:
can you post all kind of sample data for the last name column and i can modify the query accordingly.

instead of just giving one name - "de la Rosa Martinez Rodriguez III"

Author

Commented:
Here you go.  These are some sample records to co along with the SQL provided earlier.
Revised-View2.xls
Naveen KumarProduction Manager / Application Support Manager
Commented:
I suggest you to first setup a small table with one column which can include all suffixes you wanted ( this should be a manual task ) because by looking at the last name column, we cannot come to a conclusion that the last word should be the suffix.

Unless we do this manual task completely, any query which we might come up with will will something for sure because your requirement cannot be considered as granted because there is no straight forward way to identify suffix in the last name.

if you take up the suggestion, it will be good to keep inserting new suffix as and when you discover from your data and we can base our query to look for all suffixes present in the table and then split the last name into last without suffix and only suffix. This way it will be a good long term solution as well instead of hard coding JR,III etc in the substr, instr, etc...

Thanks

Naveen KumarProduction Manager / Application Support Manager
Commented:
I mean in the sense if i give you a query today, it will for you all the time as long as you are inserting the new suffixes into that maintenance table.  i would consider this as a workable approach otherwise how do you define that the query we give is going to work for all your data and how long will you modify the queries to include some new suffixes in the substr,instr,etc...

Author

Commented:
ok, I am with you . Setting up the table is fine.
Naveen KumarProduction Manager / Application Support Manager
Commented:
i will give you a sample table and sample suffix table and the query to do this. You can then do this for your query.

Author

Commented:
thank you
Naveen KumarProduction Manager / Application Support Manager
Commented:
just came up with some sample tables to make you understand the query i am going to give.

try this query and it works for me :
--------------------------------------------

select firstname, lastname, suffix,
substr(lastname,1,instr(upper(lastname),suffix,-1)-2) lastname_without_suffix,
substr(lastname,instr(upper(lastname),suffix,-1)) only_suffix
from names , suffixes
where upper(lastname) like '%' || suffix || '%'

this query will pull all the records in which suffix is there, basically to say only those records which need a suffix split.

i assume you have a space before whatever suffix is going to come at end end of the lastname and assuming that we maintain the suffix in the new table without a space.

if for some other data, this does not work, let me know the data.

Thanks,
create table names
( lastname varchar2(100),
  firstname varchar2(100));
  
create table suffixes
( suffix varchar2(30));
 
insert into names values('De Castro Jr','Carlos');
 
insert into names values('Densmore Jr','Leon');
 
insert into names values('Demayo','Francesco');
 
insert into names values('lincoln DR','abraham');
 
insert into names values('Jreshco Ilac JR','Paul');
 
select * from names;
 
insert into suffixes values('JR');
 
insert into suffixes values('MR');
 
insert into suffixes values('DR');
 
insert into suffixes values('III');
 
select * from suffixes;
 
commit;

Open in new window

Naveen KumarProduction Manager / Application Support Manager
Commented:
also i guess your last names will not have 2 suffixes... even if they are there in your last name, the output will bring 2 records for such cases and we can pick which one we want.

basically just insert the below record and then run the query which i gave you.

insert into names values('Tesci JR III','Amber');
commit;

My query will get you 2 records and it will show that either the suffix can be "JR III" or "III" only for those odd data cases. if you want to pick only one here, let me know and i can just put a check to that query to consider which ever you want.

Thanks,

Author

Commented:
I would like one record in that case for now, with the suffix "JR III".  Thank You for thinking about that .
Naveen KumarProduction Manager / Application Support Manager
Commented:
just for such odd cases, we can use the same query but we can say that in those cases we need only the one which you wanted "JR III"

I used basically the same previous query but just said if there are 2 records in the output for the same firstname, lastname then we are saying give me the suffix which has more length and in our  case it will be "JR III" which is 5 characters instead of 'III' which will just have 3 characters.

select * from (
select firstname, lastname, suffix,
substr(lastname,1,instr(upper(lastname),suffix,-1)-2) lastname_without_suffix,
substr(lastname,instr(upper(lastname),suffix,-1)) only_suffix,
row_number() over ( partition by firstname,lastname order by length(substr(lastname,instr(upper(lastname),suffix,-1))) desc ) my_order  
from names , suffixes
where upper(lastname) like '%' || suffix || '%'
)
where my_order = 1
order by firstname, lastname, my_order desc ;
Naveen KumarProduction Manager / Application Support Manager
Commented:
even if you maintain one more new suffix in the suffix table with "JR III" itself, the above given modified query will work fine and it will give you suffix as the "JR III" because the length of this will be 5 and logic holds good there as well.

Author

Commented:
Thank you that works great.

Author

Commented:
I have one more issue to resolve on this :

this is the out
 
 


select * from ( 
select fdr_personnel.first_name, fdr_personnel.last_name, cams_suffixes.suffix,
substr(fdr_personnel.last_name,1,instr(upper(fdr_personnel.last_name), cams_suffixes.suffix,-1)-2) lastname_without_suffix,
substr(fdr_personnel.last_name,instr(upper(fdr_personnel.last_name), cams_suffixes.suffix,-1)) only_suffix,
row_number() over ( partition by fdr_personnel.first_name,fdr_personnel.last_name order by length(substr(fdr_personnel.last_name,instr(upper(fdr_personnel.last_name), cams_suffixes.suffix,-1))) desc ) my_order  
from fdr.fdr_personnel , cams_suffixes
where upper(fdr_personnel.last_name) like '%' || cams_suffixes.suffix /*|| '%' */
)
where my_order = 1 
order by first_name, last_name, my_order desc ; 

Open in new window

revised-View5.xls

Author

Commented:
How do I prevent the last name to be truncated , if they contain the suffix character as the last part of the name? Should I add another table with special names or can it be solved without that?
Naveen KumarProduction Manager / Application Support Manager

Commented:
it is as simple as that to just put a space there in the below sql...

select firstname, lastname, suffix,
substr(lastname,1,instr(upper(lastname),suffix,-1)-2) lastname_without_suffix,
substr(lastname,instr(upper(lastname),suffix,-1)) only_suffix
from names , suffixes
where upper(lastname) like '% ' || suffix || '%' ; --> note a space after the first %

for example, the below last name should not be considered for splitting.

insert into names values('Tesci HelloJR','Hall');
commit;

The above query will not pick this record at all because JR is treated as the last name itself because there was no space before that.

Thanks
 

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial